Introduction
Motor disorders can compromise the functional performance of children in activities of daily living as well as school activities^{1,2}. This impairment can persist into adulthood, excluding them from important everyday activities^{3}. In this regard, the early detection of these disorders using specific assessment tools permits intervention and minimization of these effects^{4,5}.
One of the most common tests used to evaluate motor skills in children is the Movement Assessment Battery for Children (MABC)^{6-11}, which is already in its second edition^{12}. The differences between the two versions include a broader age range, a reduction in the number of age groups (from four to three), revision and inclusion of items, presentation of an innovative interpretation method, inclusion of a more representative sample, and restructuring of subtests^{6}. This instrument is already used in Brazil^{13-17}, but few studies have evaluated aspects of its validity and/or reliability in Brazilian children^{18,19}.
Validity studies can be divided into predictive validity, concurrent validity, content validity, and construct validity^{20}. According to these authors, predictive validity is established when the criterion is obtained after application of the test. Concurrent validity is evaluated when one test is proposed as a substitute for another. Content validity is established deductively by demonstrating that the test items are a sample of a universe of the object of study. Construct validity evaluates to what extent a measure varies from the construct for which it was elaborated^{21}. The lack of construct validity implies difficulties in the interpretation of the results obtained with a test^{22}.
One necessary, although not sufficient, condition for construct validity is factorial validity, which concerns the number of constructs measured by a scale and the variables that compose each construct^{23}. Construct is the name given to hypothetical factors that determine behaviors, which cannot be measured directly, only estimated^{21}. According to Kita et al.^{24}, factorial validity should be verified because it influences different aspects of the MABC-2, such as the agreement between domains and their items and the calculation of domain and total scores. Factorial validity of the MABC-2 has been evaluated in the studies of Wagner et al.^{25}, Silveira^{19} and Hua et al.^{26}, all of which detected a problematic structure in the composition of the three domains. In addition, regarding reliability, the MABC-2 exhibited the lowest Cronbach alpha (α=0.432) when compared to the Test of Gross Motor Development-2 and Escala de Desenvolvimento Motor (EDM)^{19}. In contrast, Kita et al.^{24} obtained high factorial validity of age band 2 of the MABC-2 in Japanese children.
Since factorial validation is one of the steps in the determination of the validity of an instrument, the objective of this study was to evaluate the multidimensionality of the MABC-2 (7 to 10 years old) in children from the metropolitan region of Recife-PE.
Methods
Participants
This was a methodological study for instrument validation, which involved 123 children from public and private schools in the metropolitan region of Recife-PE. The children (64 boys and 59 girls) ranged in age from 7 to 10 years (mean=9.0 years; SD=1.11 years). Weight ranged from 17.6 to 68.4 kg (mean=31 kg; SD=9.46 kg) and height from 113 to 160 cm (mean=133.09 cm; SD=9.78 cm). The children were evaluated in a private area of the schools at times agreed upon by the institutions, researchers and persons responsible for the children. Children with orthopedic, neurological or cardiac problems were excluded from the study.
The study was approved by the Ethics Committee of Complexo Hospitalar HUOC/PROCAPE (Approval No. 171.473) on December 13, 2012. All children and their legal guardians received information about the study procedures, and the legal guardians agreed to the child’s participation by signing the free informed consent form.
Instrument
The motor performance of the children was evaluated using the MABC-2, age band 2 (7 to 10 years). The test consists of eight items divided into three domains: Manual Dexterity - MD (three items: posting coins - MD1, threading lace - MD2, drawing trail - MD3); Aiming and Catching - AC (two items: catching with two hands - AC1, throwing a beanbag onto a mat - AC2); Balance - B (three items: one-board balance - B1, walking heel-to-toe forward - B2, hopping on mats - B3). The results (raw scores) were converted to standard scores according to the child’s age for each domain, which generated component scores and patterns for each skill. Summing these scores, the diagnosis of overall motor performance was obtained. Scores equal to or below the 5^{th} percentile indicated significant movement difficulty; scores between the 6^{th} and 16^{th} percentile indicated a higher chance of movement difficulty, and scores above the 16^{th} percentile indicated the absence of movement difficulty. In addition, weight was measured with a digital scale and height with an appropriate anthropometric instrument.
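The interpretation rule described above can be written as a short function. This is only an illustrative sketch: the function name and the returned labels are assumptions made here, and percentiles are assumed to be whole numbers as reported in the test manual's tables.

```python
def classify_percentile(percentile: float) -> str:
    """Map a total-score percentile to the three zones described in the text.

    Hypothetical helper for illustration; not part of the MABC-2 materials.
    """
    if percentile <= 5:
        return "significant movement difficulty"
    if percentile <= 16:
        return "higher chance of movement difficulty"
    return "no movement difficulty"


# Example: a child at the 9th percentile falls in the intermediate zone.
print(classify_percentile(9))
```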
Procedures
The children were evaluated in the schools between April 2013 and October 2014 at times agreed upon by the institutions and researchers. The test was applied by two evaluators following the protocol suggested by the authors^{12}.
Data analysis
Factor analysis (FA) can be used to reduce a set of correlated variables by grouping them into a few factors, as well as to study the underlying data structure^{27}. The data were analyzed by FA using the SPSS v. 17.0 program. Since FA assumes multivariate normality^{28}, the standard scores of each task were evaluated by the Kolmogorov-Smirnov test (normality assumed when p>0.05). Since the data did not show a normal distribution, principal axis factoring, which is indicated in these cases, was applied as the extraction method^{29}.
Rotation methods, which are either orthogonal or oblique, are used to improve the interpretation of the results of FA. Among orthogonal methods, varimax rotation is the most commonly used, which simplifies interpretation by maximizing the sum of variances of loadings in the factorial matrix. This rotation results in very high factor loadings, i.e., close to +1 or -1, or very low loadings (close to 0), facilitating interpretation of the results^{30}. According to these authors^{30}, factor loadings of 0.3 to 0.4 are considered minimally acceptable for factorial solution. In the present study, subtests with loadings higher than 0.4 were considered significant. In addition, simple structuring of the components of the factors was performed in an attempt to eliminate possible cross-loadings (loadings present in more than one factor).
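To make the effect of an orthogonal rotation concrete, the following is a minimal NumPy sketch of the widely used SVD-based varimax iteration. It is not the SPSS routine used in this study, it omits Kaiser normalization, and the loading matrix `example` is hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x factors) loading matrix with raw varimax."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)          # rotation matrix, starts as the identity
    d_old = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient-like term of the varimax criterion
        G = L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(G)
        R = u @ vt         # closest orthogonal matrix to G
        d_new = s.sum()
        if d_new < d_old * (1 + tol):
            break          # criterion stopped improving
        d_old = d_new
    return L @ R, R


# Hypothetical unrotated loadings for four variables and two factors
example = np.array([[0.6, 0.5], [0.7, 0.4], [0.5, -0.5], [0.6, -0.6]])
rotated, rotation = varimax(example)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the distribution of the loadings across factors changes.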
The fit of the FA model was verified using the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, Bartlett’s test of sphericity and individual measures of sampling adequacy (MSA). The KMO sampling adequacy was defined as follows: < 0.5 - unacceptable; 0.5 to 0.6 - miserable; 0.6 to 0.7 - mediocre; 0.7 to 0.8 - reasonable; 0.8 to 0.9 - good; ≥ 0.9 - excellent^{31}. A 95% confidence level was considered for Bartlett’s test of sphericity and values < 0.5 were defined as unacceptable for MSA^{30}.
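As an illustration of how these diagnostics are obtained from a correlation matrix, the sketch below computes the KMO, the individual MSA values and Bartlett's chi-square statistic with NumPy. The function names and the small matrix `R_demo` are assumptions for illustration, and the lower bound of each KMO band is assumed to be inclusive.

```python
import numpy as np


def kmo_and_msa(R):
    """Overall KMO and per-variable MSA from a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)      # partial (anti-image) correlations
    np.fill_diagonal(partial, 0.0)
    r = R.copy()
    np.fill_diagonal(r, 0.0)
    r2, q2 = r ** 2, partial ** 2
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    kmo = r2.sum() / (r2.sum() + q2.sum())
    return kmo, msa


def bartlett_statistic(R, n):
    """Chi-square statistic and degrees of freedom of Bartlett's test."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    return chi2, p * (p - 1) // 2


def kmo_label(kmo):
    """Verbal label for a KMO value, following the bands listed above."""
    for upper, label in [(0.5, "unacceptable"), (0.6, "miserable"),
                         (0.7, "mediocre"), (0.8, "reasonable"), (0.9, "good")]:
        if kmo < upper:
            return label
    return "excellent"


# Hypothetical 3 x 3 correlation matrix, used only to exercise the functions
R_demo = np.array([[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]])
kmo_demo, msa_demo = kmo_and_msa(R_demo)
chi2_demo, df_demo = bartlett_statistic(R_demo, n=123)
print(round(kmo_demo, 3), kmo_label(kmo_demo), round(chi2_demo, 2), df_demo)
```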
The number of factors was defined using parallel analysis (Monte Carlo simulation), with a 95% confidence interval. Parallel analysis is one of the most accurate methods for this purpose^{32}: the eigenvalues obtained prior to rotation are compared with those originating from a matrix of random values of the same dimension. Factors whose eigenvalues are higher than those from the corresponding random data can be retained; factors with lower eigenvalues may be spurious^{33}. Factor loadings > 0.4 were considered significant.
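The logic of parallel analysis can be sketched in a few lines of NumPy. This is an illustrative version, not the procedure run in SPSS: it assumes that eigenvalues are taken from the full correlation matrix, that random data are drawn from a standard normal distribution, and that the 95th percentile of the simulated eigenvalues is the retention threshold. The data set `X_demo` is simulated with two known factors.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return the number of factors whose eigenvalues exceed random data."""
    X = np.asarray(data, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)

    def sorted_eigenvalues(M):
        corr = np.corrcoef(M, rowvar=False)
        return np.sort(np.linalg.eigvalsh(corr))[::-1]

    observed = sorted_eigenvalues(X)
    simulated = np.array([sorted_eigenvalues(rng.standard_normal((n, p)))
                          for _ in range(n_sims)])
    threshold = np.percentile(simulated, percentile, axis=0)

    n_factors = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:
            break              # stop at the first factor not above chance
        n_factors += 1
    return n_factors, observed, threshold


# Simulated example: 8 variables driven by 2 uncorrelated factors
rng_demo = np.random.default_rng(1)
factors = rng_demo.standard_normal((500, 2))
pattern = np.array([[0.8, 0.0]] * 4 + [[0.0, 0.8]] * 4)
X_demo = factors @ pattern.T + 0.6 * rng_demo.standard_normal((500, 8))
n_retained, eig_obs, eig_thr = parallel_analysis(X_demo)
print(n_retained)
```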
The analyses were performed using the SPSS v. 17.0 program. The standard scores of each subtest were considered for analysis. In cases in which both limbs were tested (MD1, B1, and B3), the final standard score of the subtest was used, totaling eight components for analysis.
Results
One of the first steps of FA is to evaluate the correlations between the variables selected for grouping^{30}. Among the 28 possible correlations between variables, 17 (60.7%) showed significant correlations (Table 1).
Subtest | MD1 | MD2 | MD3 | AC1 | AC2 | B1 | B2 | B3 |
MD1 | 1.0 | | | | | | | |
MD2 | 0.394* | 1.0 | | | | | | |
MD3 | 0.104 | 0.244* | 1.0 | | | | | |
AC1 | -0.087 | -0.114 | 0.089 | 1.0 | | | | |
AC2 | -0.242* | 0.039 | 0.055 | 0.345* | 1.0 | | | |
B1 | -0.026 | 0.078 | 0.312* | 0.257* | 0.321* | 1.0 | | |
B2 | 0.111 | 0.207* | 0.193* | 0.286* | 0.009 | 0.305* | 1.0 | |
B3 | 0.063 | 0.227* | 0.169* | 0.176* | 0.160* | 0.151* | 0.226* | 1.0 |
Note: *Significant values at p<0.05. MD1: Manual Dexterity 1; MD2: Manual Dexterity 2; MD3: Manual Dexterity 3; AC1: Aiming and Catching 1; AC2: Aiming and Catching 2; B1: Balance 1; B2: Balance 2; B3: Balance 3
Source: The authors
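The counts taken from Table 1 can be checked with a few lines of code. In this sketch the lower triangle of the table is transcribed as strings, with the asterisk marking significance, and the 0.3 value is the benchmark commonly cited for an adequate correlation matrix; the variable names are arbitrary.

```python
# Lower triangle of Table 1, one list per row (MD2 ... B3)
table1 = [
    ["0.394*"],
    ["0.104", "0.244*"],
    ["-0.087", "-0.114", "0.089"],
    ["-0.242*", "0.039", "0.055", "0.345*"],
    ["-0.026", "0.078", "0.312*", "0.257*", "0.321*"],
    ["0.111", "0.207*", "0.193*", "0.286*", "0.009", "0.305*"],
    ["0.063", "0.227*", "0.169*", "0.176*", "0.160*", "0.151*", "0.226*"],
]

cells = [cell for row in table1 for cell in row]
n_pairs = len(cells)
n_significant = sum(cell.endswith("*") for cell in cells)
n_above_03 = sum(abs(float(cell.rstrip("*"))) > 0.3 for cell in cells)

print(n_pairs, n_significant, round(100 * n_significant / n_pairs, 1), n_above_03)
# prints: 28 17 60.7 5
```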
In addition, the execution of FA was justified by the KMO test of sampling adequacy (0.57) and Bartlett’s test of sphericity (p<0.001). These tests demonstrated satisfactory values for continuation of the analysis and the existence of significant correlations between some variables. However, when individual measures of sampling adequacy (MSA) were used, according to the criterion proposed by Hair et al.^{30}, MD2 and AC2 were found to be problematic, with values less than 0.5 (Table 2).
Subtest | MD1 | MD2 | MD3 | AC1 | AC2 | B1 | B2 | B3 |
MD1 | 0.525^{a} | | | | | | | |
MD2 | -0.388 | 0.495^{a} | | | | | | |
MD3 | -0.011 | -0.183 | 0.673^{a} | | | | | |
AC1 | -0.047 | 0.232 | -0.036 | 0.562^{a} | | | | |
AC2 | 0.270 | -0.186 | 0.063 | -0.320 | 0.478^{a} | | | |
B1 | -0.013 | 0.044 | -0.269 | -0.061 | -0.284 | 0.639^{a} | | |
B2 | -0.009 | -0.182 | -0.034 | -0.284 | 0.191 | -0.247 | 0.582^{a} | |
B3 | -0.002 | -0.168 | -0.077 | -0.104 | -0.097 | -0.006 | -0.128 | 0.751^{a} |
Note: ^{a}Individual measures of sampling adequacy (MSA) - diagonal. MD1: Manual Dexterity 1; MD2: Manual Dexterity 2; MD3: Manual Dexterity 3; AC1: Aiming and Catching 1; AC2: Aiming and Catching 2; B1: Balance 1; B2: Balance 2; B3: Balance 3
Source: The authors
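The KMO and MSA values follow directly from the off-diagonal entries of Tables 1 and 2: each measure is the sum of squared correlations divided by the sum of squared correlations plus squared anti-image (partial) correlations. The sketch below recomputes them from the rounded values printed in the tables, so agreement with the reported figures is expected only to about two decimal places; the variable and function names are arbitrary.

```python
names = ["MD1", "MD2", "MD3", "AC1", "AC2", "B1", "B2", "B3"]

# Off-diagonal lower triangles: correlations (Table 1), anti-image values (Table 2)
r_lower = [[], [0.394], [0.104, 0.244], [-0.087, -0.114, 0.089],
           [-0.242, 0.039, 0.055, 0.345],
           [-0.026, 0.078, 0.312, 0.257, 0.321],
           [0.111, 0.207, 0.193, 0.286, 0.009, 0.305],
           [0.063, 0.227, 0.169, 0.176, 0.160, 0.151, 0.226]]
q_lower = [[], [-0.388], [-0.011, -0.183], [-0.047, 0.232, -0.036],
           [0.270, -0.186, 0.063, -0.320],
           [-0.013, 0.044, -0.269, -0.061, -0.284],
           [-0.009, -0.182, -0.034, -0.284, 0.191, -0.247],
           [-0.002, -0.168, -0.077, -0.104, -0.097, -0.006, -0.128]]


def squared_sums(lower):
    """Sum of squared off-diagonal values involving each variable."""
    totals = [0.0] * len(lower)
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            totals[i] += value ** 2
            totals[j] += value ** 2
    return totals


r2, q2 = squared_sums(r_lower), squared_sums(q_lower)
msa = {name: r / (r + q) for name, r, q in zip(names, r2, q2)}
kmo = sum(r2) / (sum(r2) + sum(q2))
flagged = [name for name in names if msa[name] < 0.5]

print(round(kmo, 2), flagged)   # 0.57 ['MD2', 'AC2']
```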
An important decision to make in FA is the number of factors to extract^{29}. Parallel analysis (Monte Carlo simulation) was used for this purpose, which is one of the most accurate methods^{32}. According to this criterion, the extraction of two factors is recommended for the MABC-2 (Figure 1), contrary to the initial proposal of the instrument composed of three assessment domains.
When FA was performed with varimax rotation, three of the subtests (B2, B3, and MD3) did not show significant loadings (>0.4) in their respective factors (Table 3). In addition, the composition of the originally proposed instrument was not replicated in either factor.
Subtest | Factor 1 | Factor 2 | Communality |
B1 | 0.580 | 0.169 | 0.365 |
AC1 | 0.565 | -0.065 | 0.323 |
AC2 | 0.531 | -0.131 | 0.299 |
B2 | 0.368 | 0.338 | 0.250 |
B3 | 0.304 | 0.273 | 0.167 |
MD2 | 0.009 | 0.668 | 0.446 |
MD1 | -0.217 | 0.561 | 0.362 |
MD3 | 0.275 | 0.349 | 0.197 |
Note: MD1: Manual Dexterity 1; MD2: Manual Dexterity 2; MD3: Manual Dexterity 3; AC1: Aiming and Catching 1; AC2: Aiming and Catching 2; B1: Balance 1; B2: Balance 2; B3: Balance 3. Extraction method: principal axis factoring; rotation method: varimax with Kaiser normalization
Source: The authors
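In an orthogonal solution, the communality of a subtest is the sum of its squared loadings on the extracted factors. The short check below applies this to the values in Table 3 and also lists the subtests with no loading above the 0.4 criterion; the dictionary layout and names are arbitrary choices made for illustration.

```python
# Table 3: subtest -> (loading on factor 1, loading on factor 2, communality)
table3 = {
    "B1": (0.580, 0.169, 0.365),
    "AC1": (0.565, -0.065, 0.323),
    "AC2": (0.531, -0.131, 0.299),
    "B2": (0.368, 0.338, 0.250),
    "B3": (0.304, 0.273, 0.167),
    "MD2": (0.009, 0.668, 0.446),
    "MD1": (-0.217, 0.561, 0.362),
    "MD3": (0.275, 0.349, 0.197),
}

# Communality recomputed from the two loadings of each subtest
recomputed = {name: f1 ** 2 + f2 ** 2 for name, (f1, f2, _) in table3.items()}
max_error = max(abs(recomputed[name] - table3[name][2]) for name in table3)

# Subtests whose largest absolute loading does not exceed 0.4
weak = [name for name, (f1, f2, _) in table3.items()
        if max(abs(f1), abs(f2)) <= 0.4]

print(weak)   # ['B2', 'B3', 'MD3']
```

The recomputed communalities agree with the tabulated ones to within rounding error (less than 0.001).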
Since three of the subtests did not show significant factor loadings in the model, consecutive analyses were performed removing variables one by one (Table 4). For exclusion, the first criterion observed in the subtests without representative loadings in the model was MSA: the one with the lowest value would be removed if it was less than 0.5, as suggested by Hair et al.^{30}. If MSA was higher than 0.5, the communality values would be observed: the subtest with the lowest variance explained by the factors would be excluded from the next analysis. Thus, a new FA was performed, removing subtest AC2 since it obtained the lowest MSA value (< 0.5) (Table 2).
Model | Subtest | KMO | MSA | Communality | Factor loading (factor 1 or 2) |
1 | MD1 | 0.637 | 0.578 | 0.208 | 0.456 |
1 | MD2 | | 0.566 | 0.768 | 0.863 |
1 | MD3 | | 0.690 | 0.193 | |
1 | AC1 | | 0.593 | 0.294 | 0.498 |
1 | B1 | | 0.663 | 0.326 | 0.571 |
1 | B2 | | 0.693 | 0.325 | 0.552 |
1 | B3 | | 0.712 | 0.153* | |
1 | % Variance | Factor 1: 28.791 | Factor 2: 20.313 | Cum: 49.404 | |
2 | MD1 | 0.605 | 0.571 | 0.231 | 0.481 |
2 | MD2 | | 0.553 | 0.693 | 0.825 |
2 | MD3 | | 0.654 | 0.202* | |
2 | AC1 | | 0.581 | 0.261 | 0.472 |
2 | B1 | | 0.636 | 0.381 | 0.617 |
2 | B2 | | 0.640 | 0.307 | 0.529 |
2 | % Variance | Factor 1: 30.672 | Factor 2: 24.047 | Cum: 54.719 | |
3 | MD1 | 0.561 | 0.543 | 0.263 | 0.512 |
3 | MD2 | | 0.514 | 0.596 | 0.769 |
3 | AC1 | | 0.561 | 0.316 | 0.532 |
3 | B1 | | 0.624 | 0.246 | 0.495 |
3 | B2 | | 0.584 | 0.418 | 0.607 |
3 | % Variance | Factor 1: 32.226 | Factor 2: 28.817 | Cum: 61.044 | |
Note: MD1: Manual Dexterity 1; MD2: Manual Dexterity 2; MD3: Manual Dexterity 3; AC1: Aiming and Catching 1; AC2: Aiming and Catching 2; B1: Balance 1; B2: Balance 2; B3: Balance 3. Cum: cumulative frequency; KMO: Kaiser-Meyer-Olkin measure of sampling adequacy; MSA: individual measures of sampling adequacy
*Criterion used for exclusion of the subtest.
Source: The authors
The data were then submitted to two further analyses, in which subtests B3 and MD3 were excluded in turn, in addition to AC2. Model 3, composed of subtests MD1, MD2, AC1, B1 and B2, exhibited the highest percentage of variance explained by the factors and had all variables with representative loadings in their respective factors (Table 4), suggesting a better fit to the data of this study.
Discussion
Factor analysis demonstrates the extent to which variables agree in measuring one or more common dimensions and can be used to describe the underlying conceptual structure of an instrument^{34}. The factors, each consisting of a group of highly correlated variables, are considered to represent dimensions within the data^{30}. Thus, an important step in FA is the observation of correlations between variables. Variables without significant correlations may not belong to the factors, whereas those with a large number of correlations may participate in different factors simultaneously^{30}. In general, the ideal is a correlation matrix in which most coefficients are higher than 0.3^{35}. This was not observed in the present study, in which only five of the possible combinations of variables exhibited correlations above this value. This finding may have been one of the reasons for the low KMO value, which is nevertheless acceptable for this analysis. The KMO is influenced by these correlations, as well as by the sample size and number of variables, and inversely by the number of factors^{30}. Bartlett’s test of sphericity, in turn, confirmed the existence of at least one significant correlation between the subtests of the MABC-2.
Another factor that must be considered in FA is the size of the sample. In general, it is recommended that the analysis be performed with at least 100 observations and a participant-variable ratio of 5:1 or higher^{28,30}. Both criteria were met in the present study, in which the total sample consisted of 123 children, with 8 variables in the model.
Analysis performed with eight subtests revealed the presence of variables with MSA values less than 0.5. According to Hair et al.^{30}, these measures can be used to identify possible variables to be eliminated from the model. These variables should be removed one by one, starting with the variable with the lowest MSA and recalculating the FA until the individual measures reach an acceptable level. However, the final decision about continuation should be made based on the level of association between the variable and the extracted factor, i.e., communality^{35}.
The data point to the existence of only two dimensions as indicated by parallel analysis (Monte Carlo simulation), contrary to the original proposal of the authors for the MABC-2. The alignment of subtests different from the expected and the better fit of the model after the exclusion of three subtests (AC2, B3, and MD3) suggest the possibility of adaptation of some items to improve the construct validity of the MABC-2, as demonstrated in the study of Hua et al.^{26}. In that study, the proposed model with three motor skill domains also showed unsatisfactory fit in age band 1 for Chinese children^{26}. However, the exclusion of the subtests “drawing trail” and “walking heels raised” resulted in a better fit of the data to the model in children of that country.
Silveira^{19}, who analyzed the same age range as the present study, also identified a different grouping of subtests in the three factors, in addition to low internal consistency (α=0.432). On the other hand, the study of Valentini, Ramalho and Oliveira^{18} indicated good construct validity. However, the assessment was only based on Cronbach’s alpha, which was 0.78.
Although the study of Wagner et al.^{25} provided evidence that confirmed the factorial validity of the MABC-2 in German children (age band 2), problems were identified in its substructures. The authors indicated subtests MD3, AC1, B2 and B3 to be poorly reliable since less than 40% of their respective variances were explained by the factors. Similarly, in the present study, three of the four variables considered less reliable in the cited study^{25} (B2, B3, and MD3) also had low variances explained by the factorial solution and showed nonsignificant factor loadings in the analysis with eight subtests (Table 3).
The disagreement with the grouping of variables proposed for the MABC-2 suggests that the component (MD, AC, B) scores cannot be used for interpretation. However, a scale does not have to be multidimensional, i.e., contain subscores measuring various components. It can be unidimensional and contain a final score to evaluate a given phenomenon. For this purpose, the results of FA should indicate the existence of a single factor, in which case the scale will have factorial validity^{23}. The present study, however, showed that the MABC-2 is multidimensional, since parallel analysis demonstrated the existence of two factors. These factors differ from the original proposal of the instrument, and the structure of the original instrument could therefore not be confirmed. An instrument is developed from a theoretical framework so that, through operational indicators, a broader construct can be known. Validity tests are then used to demonstrate how much of the evidence and theoretical support are present in the instrument developed^{36}. Although the proposal of the motor skill test battery seems attractive and some subtests appear to capture some of the components of motor performance, the psychometric properties of the instrument do not favor the tool. The lack of factorial validity compromises other types of validity (divergent, concurrent) and suggests the need to revise the composition of some items of the test so that it becomes adequate to evaluate Brazilian children.
Practical implications
The divergence in the multidimensionality of the MABC-2 indicates that the domains intended for the instrument do not function as predicted by the authors. This can compromise the practical utility of the test since the capacity to evaluate motor performance as predicted was not confirmed. This fact does not imply that the result of the MABC-2 is invalid, but the results of its subscales (manual dexterity, aiming and catching, and balance) should be interpreted with caution since they may be prone to errors, considering that the results showed a structure different from that proposed by the authors of the test.
It should also be noted that the construct validity in FA should not simply be assumed if the instrument appears to have an adequate factorial structure. In fact, construct validity must be based on a nomological context that includes consistent data, which give rise to the impression of acceptance of the construct validity of a given measure^{21}.
Limitations of the study
Although this study corroborates data of previous studies, some limitations should be addressed. Since Brazil is a very large country with many cultural influences, the transcultural validity of the MABC-2 should be evaluated in different regions of the country using larger samples. Other techniques, such as structural equation modeling, might be applied to analyze the degree of fit of the initial model of the MABC-2. Other age groups (age bands 1 and 3) should also be included in subsequent studies.
Conclusions
Although the MABC-2 is commonly used in Brazil, the present data confirmed its multidimensionality but not the structure proposed by its authors. The data revealed problems in the correlation between variables, in the number of factors extracted, and in the grouping of subtests. The exclusion of three subtests (MD3, AC2, and B3) from the analysis resulted in a better fit of the model, suggesting the possibility of modifying these activities for application of the MABC-2 to Brazilian children. However, further studies involving a larger sample and using other analysis techniques, as well as investigating other age groups, are needed.