Infant Motor Scale of Alberta: validation for a population of Southern Brazil

Institution: School of Physical Education, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil
1 PhD in Health and Human Performance from Auburn University, Alabama, United States; Adjunct Professor at UFRGS, Porto Alegre, RS, Brazil
2 Doctoral candidate in Human Movement Sciences at the School of Physical Education, UFRGS; Professor at Universidade de Caxias do Sul (UCS), Porto Alegre, RS, Brazil


Introduction
During the first years of life, motor behaviors provide information on the integrity and functionality of other systems, and changes in these systems become apparent over time (1). Motor delays are the first signs of possible developmental disorders. Vulnerable children exposed to risk factors may resist the negative effects of this exposure if diagnosed early, since learning is enhanced as a result of brain plasticity (2). Early identification of development and motor function levels and introduction of intervention strategies may optimize the outcome for patients, allowing for proper decision making about strategies for optimal performance (3).
The evaluation of development is ineffective when clinical investigation is the only method used (4,5). Reliable scales, with good sensitivity and specificity, should be used to detect neuropsychomotor abnormalities (5,6). In Brazil, the challenge of diagnosing developmental changes is aggravated by the lack of normative data and instruments standardized and validated for early childhood. The Alberta Infant Motor Scale (AIMS) is an example of an instrument used for Brazilian children without proper validation (5,7-9). The use of this instrument without considering the required cross-cultural adaptations may lead to erroneous categorizations of motor developmental delays (10).
The AIMS is an observational assessment tool used to measure gross motor maturation that evaluates the sequence of motor development and postural control of antigravity muscles in term and preterm infants in four positions: prone, supine, sitting, and standing (11). The clinical feasibility and psychometric properties of the AIMS have made it a valuable tool for identifying motor delays and abnormalities, providing information to health professionals and family about acquisition of motor skills, monitoring motor performance over time, detecting subtle changes, and evaluating the effectiveness of interventions in children with neuropsychomotor delays or disorders (11,12). The theoretical construct of the AIMS establishes this scale as a supporting tool for research, clinical practice, and intervention actions (10,13-22).
Estimates of reliability and validity for the AIMS have yet to be made for Brazilian children. Instruments used to assess motor development that were standardized in the source country may yield different results when applied in other environments and in diverse socioeconomic, ethnic, and cultural backgrounds (18,21). In general, researchers tend to employ international norms and standards when assessing motor development in Brazilian children, thus hindering the generalization of results. There is a need for studies verifying the psychometric properties of the AIMS, since satisfactory levels of validity and reliability may not be achieved when the scale is used in culturally distinct populations. The objective of this study was to translate, adapt, and verify the validity of motor and construct criteria (internal consistency, discriminant validity, correlation with other tests, and predictive validity) of the Brazilian version of the AIMS.

Method
This study was approved by the Research Ethics Committee of Universidade Federal do Rio Grande do Sul, Brazil. A total of 21 professionals (4 translators, 9 physiotherapists, 6 physical educators, 1 nurse, and 1 pediatrician) participated in the process of translation, adaptation, and content validity of the Brazilian version of the AIMS. The sample was composed of 561 preterm and term children (291 boys and 270 girls) from nursery schools and primary health care units who were divided in a representative manner according to age group (Table 1).
Children were selected consecutively after permission was received from the institutions and informed consent was obtained from the parents/legal guardians. Inclusion criteria were age between 0 and 18 months and no previous participation in intervention programs. Exclusion criteria were musculoskeletal disorders, such as fractures, peripheral nerve injury, and musculoskeletal infection, among others. Sample size calculation was performed using Programs for Epidemiologists, version 4.0. For a 95% confidence level, a response rate of 50%, and a 4% margin of error, approximately 600 children should be evaluated.
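The target of approximately 600 children agrees with the standard sample size formula for estimating a proportion, n = z^2 p(1 - p) / e^2. The sketch below assumes that this formula underlies the reported figure (the text names only the software), that the "response rate of 50%" is the expected proportion p, and that z = 1.96 corresponds to the 95% confidence level. The function name is illustrative.

```python
import math


def sample_size_for_proportion(z: float, p: float, margin: float) -> int:
    """Sample size needed to estimate a proportion p within +/- margin.

    Uses n = z^2 * p * (1 - p) / margin^2 with no finite-population
    correction, rounded up to the next whole participant.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")
    n = (z ** 2) * p * (1 - p) / (margin ** 2)
    return math.ceil(n)


# Assumed inputs: 95% confidence (z = 1.96), p = 0.50, 4% margin of error.
n_required = sample_size_for_proportion(z=1.96, p=0.50, margin=0.04)
print(n_required)  # close to the "approximately 600" stated in the text
```

Under these assumptions the unrounded value is 600.25, so the formula reproduces the stated target.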
The AIMS is an observational assessment tool used to measure the development of term and preterm infants, from 38 weeks gestational age to 18 months corrected age (11), comprising 58 items divided into subscales (prone, supine, sitting, and standing) that describe spontaneous movement and motor skills. The examiner observes the child, taking into account aspects related to weight bearing, posture, and antigravity movements (11,12). The scale presents raw scores, percentiles, and a categorization of motor performance: normal (above the 25th percentile); suspect (between the 25th and 5th percentiles); and abnormal (below the 5th percentile) (11,12). We also applied the Child Behavior Development Scale (CBDS) (23), standardized for Brazilian children aged 1 to 12 months, in which behaviors are classified as: axial and appendicular; spontaneous and stimulated; and communicative and non-communicative. This scale allows an estimate of the degree to which the child's behavior is consistent with expectations for his or her age level.
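The percentile-based categorization described above can be written as a small lookup. This is a minimal sketch, not part of the AIMS materials: the function name is hypothetical, and assigning the boundary values (exactly 25 and exactly 5) to the "suspect" category is an assumption, since the text does not say how ties are handled.

```python
def classify_aims_percentile(percentile: float) -> str:
    """Map an AIMS percentile to the performance category quoted in the text.

    Assumption: the boundary values 25 and 5 are treated as "suspect".
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile > 25:
        return "normal"
    if percentile >= 5:
        return "suspect"
    return "abnormal"


# Hypothetical percentiles for three infants.
for value in (60, 10, 3):
    print(value, classify_aims_percentile(value))
```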
Translators and specialists in motor behavior translated and evaluated AIMS motor criteria. The translators worked first individually and then as a committee. The raters assigned scores for clarity and relevance individually.
After permission was received from the institutions, the parents/legal guardians were sent a consent form. The application of the scale took approximately 20 minutes per child in their homes or institutions of origin, with minimal manipulation. The sessions were videotaped for later analysis by independent evaluators. To analyze the temporal stability of the scale, part of the sample was retested (n=259) within 15 days.
To ensure an accurate and valid version of the instrument, a preliminary version (translation) was prepared, evaluated, and adapted to the Brazilian culture, including the analysis of content validity, objectivity, reliability, construct validity, and criterion validity (24-28). Two independent bilingual translators translated the instrument, resulting in two Portuguese-language versions. Based on the two translations, the instrument was back-translated into English by two other bilingual professionals, resulting in two new English-language versions of the test. The final version was established in consultation with the translators and the researchers.
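Content validity was established by three raters with expertise in the area. Content validity index (CVI) and kappa coefficient of agreement were calculated, and face validity was determined by health professionals (4 PhDs, 4 Masters, 1 specialist, 8 graduates). Each rater received the adapted version and used a 5-point Likert scale to assign scores regarding clarity and relevance of motor criteria. Regarding reproducibility, well-trained evaluators with more than two years of experience in the use of the instrument analyzed interrater and intrarater agreement using the intraclass correlation coefficient (ICC), Friedman test, and Wilcoxon test.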
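As an illustration of how a content validity index can be obtained from Likert ratings, the sketch below computes an item-level CVI as the proportion of raters who score an item 4 or 5. The cutoff of 4 and the function names are assumptions; the study does not state the exact formula it used, and the ratings shown are hypothetical.

```python
def item_cvi(ratings: list[int], threshold: int = 4) -> float:
    """Proportion of raters scoring an item at or above the threshold.

    Assumption: scores of 4 or 5 on a 5-point Likert scale count as endorsement.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    endorsed = sum(1 for score in ratings if score >= threshold)
    return endorsed / len(ratings)


def scale_cvi(items: list[list[int]], threshold: int = 4) -> float:
    """Average of the item-level CVIs across all items."""
    if not items:
        raise ValueError("at least one item is required")
    return sum(item_cvi(r, threshold) for r in items) / len(items)


# Hypothetical clarity ratings from three raters for four items.
clarity_ratings = [[5, 4, 3], [5, 5, 4], [4, 4, 5], [5, 5, 5]]
for index, ratings in enumerate(clarity_ratings, start=1):
    print(f"item {index}: CVI = {item_cvi(ratings):.3f}")
print(f"scale CVI = {scale_cvi(clarity_ratings):.3f}")
```

With three raters, an item endorsed by two of them receives 2/3, or about 66.7%.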
As for test-retest reliability, 259 children were retested within a 15-day interval, and the stability of scores over time was evaluated using the Wilcoxon test, Spearman's correlation coefficient, kappa coefficient of agreement, and McNemar-Bowker test.
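The kappa coefficient used for test-retest agreement on categorical ratings can be sketched as follows. This is the standard Cohen's kappa for two sets of ratings, (observed agreement - chance agreement) / (1 - chance agreement); whether the study used this unweighted form is an assumption, and the ratings below are hypothetical.

```python
from collections import Counter


def cohen_kappa(first: list[str], second: list[str]) -> float:
    """Unweighted Cohen's kappa for two paired lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("both rating lists must be non-empty and equally long")
    n = len(first)
    # Observed agreement: share of children with the same category both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    counts_first = Counter(first)
    counts_second = Counter(second)
    expected = sum(counts_first[c] * counts_second[c] for c in counts_first) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical test and retest categories for six children.
test_ratings = ["normal", "normal", "suspect", "abnormal", "normal", "suspect"]
retest_ratings = ["normal", "normal", "suspect", "suspect", "normal", "suspect"]
print(round(cohen_kappa(test_ratings, retest_ratings), 3))
```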
For the analysis of criterion validity, expressed as predictive validity, 28 children were followed up. Eight children were assessed monthly for a total of 5 assessments (G1: 5 boys and 3 girls aged 1 to 6 months at the 1st assessment); 20 children were evaluated with the AIMS and retested after 6 months (G2: 11 boys and 9 girls aged 1 to 11 months). Data were analyzed using the Friedman test, chi-square test, and Pearson's correlation coefficient.
For construct validity, the following procedures were performed: (1) analysis of internal consistency using Cronbach's alpha as an indicator for the test and items (n=561); (2) analysis of discriminant validity using the Student t test, Pearson's chi-square test, and Pearson's correlation coefficient to compare preterm infants (n=124, aged 0 to 15 months, including 62 atypical, extremely premature infants) with typical infants (born at term), matched for chronological age; (3) correlation of AIMS with CBDS rating criteria of 40 children, 36 preterm and 4 term infants aged 0 to 12 months, using Spearman's correlation coefficient, Kendall's coefficient, kappa coefficient of agreement, and McNemar-Bowker test.
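For the internal consistency step, Cronbach's alpha follows the usual formula alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is a generic implementation using sample variances; the data are hypothetical subscale scores, not values from the study, and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for a table with one row per child and one column per item."""
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every row needs the same number (>= 2) of items")
    # Sample variance of each item (column) and of the row totals.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores for five children on four subscales
# (prone, supine, sitting, standing).
subscale_scores = [
    [4, 3, 2, 1],
    [8, 6, 5, 3],
    [12, 8, 9, 5],
    [16, 9, 11, 8],
    [21, 9, 12, 14],
]
print(round(cronbach_alpha(subscale_scores), 3))
```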

Results
In the unified version, the semantics of the items was maintained and only a few changes were necessary, such as the replacement of some words with more commonly used synonyms. Correction of technical terms used in the translations and adaptation of motor descriptors for each item were required to allow a better understanding by the target audience.
With respect to face validity, 100% of health professionals assigned a Likert scale score of 5 in the analysis of AIMS items. Regarding content validity (Table 2), CVI agreement values ranged from 66.7% to 92.8% for clarity, with values above 98% for relevance. For both clarity and relevance, a significant and strong agreement was observed among raters (kappa coefficient, p<0.05). Regarding objectivity, ICC ranged from 0.86 to 0.99, indicating strong agreement among raters (Table 3). No significant differences were found among raters at the 0.05 significance level. The analysis of intrarater reliability showed strong agreement, with ICC ranging from 0.915 to 0.993.
In the analysis of discriminant validity, term infants had significantly higher scores than preterm infants, both in total score (p<0.001) and percentiles (p=0.04). A significant and strong correlation was found between scores and percentiles in both groups (p<0.001). When comparing rating criteria, the term group was significantly associated with the normal range, whereas the preterm group was associated with suspected delay or delay (p=0.047).
A significant and moderate positive correlation was also detected between the AIMS and CBDS, confirmed by Kendall's coefficient (W=0.319; p=0.02) and kappa coefficient (0.309; p=0.003). The McNemar-Bowker test (p=0.047) revealed significant differences between instruments: the AIMS rated a greater number of children as "normal" and the CBDS rated significantly more cases as "delay".
The analysis of predictive validity in the group of children followed up longitudinally revealed a significant and strong positive joint correlation for the following variables: final score (ICC=0.96, range: 0.89-0.99; p<0.0001); percentile (ICC=0.88, range: 0.67-0.97; p<0.0001); and performance rating (ICC=0.83, range: 0.49-0.96; p=0.001), indicating that higher values in the 1st assessment were correlated with higher values in each subsequent assessment. In the comparative analysis, the Friedman test revealed a difference in scores from assessments 1 to 5 (p<0.001). Scores from assessment 5 were higher than those from assessment 4, which in turn were higher than those from assessment 3, which in turn were higher than those from assessments 1 and 2. The percentile values in all assessments showed a significant and progressive increase (p=0.049). No statistically significant differences were observed when ratings from assessments 1 to 5 were compared (p=0.37).
The analysis of children (n=20) evaluated twice within a 6-month interval revealed a significant and strong positive correlation between scores 1 and 2 (r=0.730; p<0.0001). The correlations between the two assessments for percentiles (r=0.22; p=0.347) and ratings (rho=0.26; p=0.269) were not significant. A significant difference was found between mean values of scores 1 and 2 (p<0.001): the mean total score at assessment 2 was higher than at assessment 1, and all children had higher values at assessment 2 than at assessment 1. Percentiles at assessment 2 were also significantly higher than those at assessment 1 (p<0.01). A direct comparison of percentiles at both assessments revealed that, out of 20 children, 18 (90.0%) had higher percentiles and 2 (10.0%) had lower percentiles at assessment 2. The agreement analysis of performance rating (assessment 1 vs. 2) showed good (kappa=0.458) and significant (p=0.03) results.

Discussion
The analysis of the psychometric properties of the scale indicated good reliability, similar to that obtained in the Canadian population (11). The independent back-translation resulted in a unified version in Portuguese of the AIMS, the Escala Motora Infantil de Alberta (EMIA). A comparison among the four translated versions, discussion, and analysis in consultation with all translators allowed a reduction of biases inherent in the process carried out with only one translator.
Regarding face validity, there was unanimous opinion among the professionals involved that the content was appropriate to assess motor skills of children in different postures. In the analysis of content validity, the CVI for clarity and relevance showed strong agreement among raters, confirmed by the kappa coefficient of agreement with similar responses (29). The results indicate that the Brazilian version of the AIMS (EMIA) showed excellent content validity, with clear and relevant criteria, adequate representation of items in relation to concepts, and theoretical relevance.
The results also demonstrated strong agreement among raters (ICC: 0.86-0.99), high reliability, and agreement with values proposed by the author of the scale, who suggests values above 0.80 (12), which appears to be sufficient to classify our results as correct (27).
Regarding temporal stability of the scale, percentiles and performance rating criteria remained stable between test and retest. High percentiles and ratings in the test were significantly associated with high values in the retest. The Brazilian version of the AIMS (EMIA) has reliability and temporal stability, with data showing high correlation and no significant differences between test and retest (29). The phenomenon of acquiescence (whether positive or negative) was not observed, indicating reliable data (25). The correlation values for percentiles (rho=0.85) and rating criteria (kappa=0.68) could have been even higher if a shorter time interval between test and retest had been employed, considering a 7-day period as a reference interval between assessments (11).
The Brazilian version of the AIMS (EMIA) has high internal consistency; the values obtained using Cronbach's alpha (0.72-0.89) reflect a profile of high homogeneity among variables. The Cronbach's alpha coefficient should be at least 0.60, and the larger the sample, the more difficult it becomes to obtain consistent results (27). Given the sample size, our results show homogeneous items representing the same trait and measuring the same construct.
The Brazilian version of the AIMS (EMIA) was shown to be valid to distinguish atypical behaviors, because percentiles and total score were significantly different in the groups of preterm and term infants investigated. The results indicate that a high total score was associated with high percentile values, with a positive correlation between two measurements of the same construct, thus confirming convergent validity (27). Considering rating criteria, discriminant validity was confirmed, establishing a difference between term and preterm groups. Results from previous studies confirm the ability of the scale to diagnose motor disorders during the first years of life (15,16,18,22,30).
A comparison between the Brazilian version of the AIMS (EMIA) and CBDS categorizations revealed a moderate correlation (rho=0.34; kappa=0.30; W=0.31), indicating poor concurrent validity, which is consistent with findings from studies comparing the AIMS with the Test of Infant Performance (correlations between 0.20 and 0.67) (17) and the AIMS with the Daily Activities of Infant Scale (30). Campos et al. observed moderate agreement of the AIMS with the Bayley scale at 5 months (k=0.503) and poor agreement at 10 months (k=0.209) (5). In contrast, international studies (11,13,15,21) and a Brazilian study (14) showed a high correlation of the AIMS with other motor scales. In the present study, the results obtained may be due to the fact that the CBDS has fewer items related to motor function (15 items) compared to the Brazilian version of the AIMS (EMIA) (58 items), thus compromising the reliability of results. Although the CBDS is applied with similar goals, this scale also evaluates social and cognitive aspects, thus providing a reduced number of items in the motor subscale. Both instruments have advantages and disadvantages. Therefore, professionals should be able to determine which one better meets their needs.
Our results suggest that the Brazilian version of the AIMS (EMIA) has predictive validity, since ratings were similar longitudinally (G1) and between pre- and post-tests (G2). These findings are consistent with a study for validation of the instrument in the Canadian population (12), but different from a study conducted in Taiwan, which found limited predictive validity for this scale to evaluate preterm infants (21).
The increase in percentile values over time, within the age groups studied, highlights a greater sensitivity and accuracy of the Brazilian version of the AIMS (EMIA) to evaluate children aged 3 to 9 months, as demonstrated in a previous study (17).
It can be concluded that the procedures performed to analyze the psychometric properties of the Brazilian version of the AIMS (EMIA) showed that the back-translation was effective in avoiding biases resulting from misunderstandings of the English original. In addition, the scale was recognized by experts as an efficient tool to evaluate motor function in children and proved to be a reliable and consistent instrument, with significant predictive and discriminant power. It is noteworthy, however, that the results for concurrent validity were limited, thus warranting a comparison of the AIMS with motor tests other than the CBDS. Regarding predictive validity, a longer follow-up period could have yielded different results, favoring the predictive power of the instrument analyzed in this study.
The results obtained can be reproduced in daily practice, confirming the reliability and content, criterion, and construct validity of the Brazilian version of the AIMS (EMIA), thus encouraging professionals to use this scale in the evaluation and planning of intervention programs. We highlight the need for normative studies involving Brazilian children, since the use of categorizations from other populations may not be suitable to interpret data obtained in Brazilian children, assuming that these categorizations may vary given the diversity of socioeconomic and cultural factors in the country.

Table 1 - General distribution of the study sample according to age group (trimester) and sex.

Table 2 - Content validity index (CVI) and kappa coefficient of agreement regarding clarity and relevance of the Brazilian version of the Alberta Infant Motor Scale.

Table 3 - Intrarater and interrater analysis of the objectivity of the Brazilian version of the Alberta Infant Motor Scale. ICC = intraclass correlation coefficient.