Auditory perceptual performance of children in the identification of contrasts between stressed vowels

Purpose: To assess the auditory perceptual performance of children in a task of identification of vowel contrasts, to classify which phonemes and vowel contrasts provide higher or lower degrees of difficulty, and to verify the influence of age in this performance. Methods: Data recordings of auditory perceptual performance of 66 children in a task of identification using the software Perception Evaluation Auditive & Visuelle (PERCEVAL) were selected from a database. The task consisted of presenting sound stimuli through headphones to children, who would then choose, from two pictures arranged on the computer screen, the one corresponding to the word they heard. The time between auditory inputs and the child’s reaction was automatically computed in the software. Results: The perceptual accuracy was 88% and we found a positive correlation with the variable age. The time of response was significantly longer for incorrect answers as opposed to correct answers (p=0.00). Different degrees of similarity in auditory perception were observed, where front vowels were similar more often than back vowels. The tendency for errors was prevalent in the range of non-peripheral to peripheral vowels, which suggests that the latter may serve as a reference or perceptual anchor. Conclusion: The auditory perceptual ability concerning the identification of vowel contrasts is not yet established in the age group studied. The auditory perception of vowel contrasts occurs gradually and asymmetrically, as the order of acquisition in terms of production and perception was not always the same.


INTRODUCTION
Studies on phonological acquisition have been mostly dedicated to speech production, that is, to when and how children learn to produce the target sounds of their language (1-3).
At the production level, the following order of phonological acquisition is well established: vowels, plosives, nasals, fricatives, and liquids (4). Within each of these groups, gradual acquisition is also predicted (4). In the case of vowels, the segments that form the basis of the vowel triangle (/a/, /i/, and /u/) are expected to be acquired before the high mid-vowels /e/ and /o/ and, finally, the low mid-vowels /ɛ/ and /ɔ/. Among obstruents, labials precede coronals, which in turn precede dorsals. In addition, voiceless obstruents precede voiced ones. Among nasals, /m/ and /n/ are acquired first, and /ɲ/ is the last to be established. Finally, the order of acquisition of liquids alternates between lateral and non-lateral sounds: /l/ is the first, followed by /R/, /ʎ/, and /r/.
The national literature lacks studies that investigate and describe the acquisition of phonological contrasts from the point of view of perception. What is usually found are descriptions of the development of auditory perception skills in babies (5,6) and experiments on auditory perception with adults (7). The only exception is a study on the acquisition of auditory perception of contrasts between occlusive consonants of Brazilian Portuguese (BP) (8).
Regarding the development of auditory perception skills in children, we highlight a study (5) whose authors summarize the main linguistic achievements of babies in their first year of life, based on reviews of international research. According to that study, babies are sensitive to prosodic markings in their parents' speech right after birth, as well as to phonetic differences, especially those related to syllabic (stressed × unstressed syllables), vowel, and consonant contrasts. However, this innate capacity decreases during the first year of life. The study concludes that children pay more attention to prosodic markings in the first months after birth and, at eight months of age, show more interest in syllabic sequences.
Another work (6) also addressed the development of auditory perception skills in babies. The author highlights that 36-40-week-old fetuses can already perceive changes in syllabic order (e.g., between babi and biba). Additionally, based on experimental descriptions found in the international literature, she calls attention to babies' ability to detect phonetic contrasts even in their first weeks of life.
It is worth noting that there is evidence in the international literature (9-11) of a reorganization in auditory perception at the end of the first year of life, when sensitivity to non-native contrasts decreases and children's attention is drawn to the contrasts of their own language.
The author of a classical international research paper (12) reported, based on the studies by Shvachkin (1948/1973), experiments with children around the age of 2-3 years that showed gradual acquisition of the perception of segmental contrasts in their mother language. In agreement with Shvachkin, the author described that the acquisition of auditory perception of certain native contrasts tends to be more successful than that of others, thereby proposing an order of acquisition similar to that of production, established by Jakobson (1941/1968).
Thus, from the literature, we infer the existence of a process of acquisition of auditory perception, since this skill seems to be reorganized with age: as children grow, their sensitivity to non-native contrasts decreases and their ability to distinguish native contrasts increases.
As mentioned before, the national literature lacks studies on phonological acquisition from the perspective of perception, since most of them address acquisition from the perspective of production. The present study therefore aimed to investigate the auditory perception performance of children regarding the stressed vowels of BP. The specific objectives were: (1) to verify the auditory perception of children by means of a task of identification of phonological contrasts of BP; (2) to identify which phonemes and vowel contrasts were related to higher or lower degrees of difficulty for the children; and (3) to evaluate whether age is related to auditory perceptual accuracy.
The choice of investigating auditory perception of vowel contrasts is justified by the fact that this sound group is the first to be acquired in the process of phonological acquisition in terms of production.
Therefore, we believe that this investigation may bring valuable scientific contributions such as (1) help to understand the development of auditory perception of children; (2) provide more information about the perception of phonological contrasts of stressed vowels in BP; and (3) improve clinical assessments of this performance in patients with phonoaudiological disorders.

METHODS

Sample
After approval by the Ethics Committee of Universidade Estadual Paulista "Júlio de Mesquita Filho" (UNESP) (protocol 132/2010), information about the auditory perceptual performance of 66 children in identifying stressed vowels of BP was selected from the database of a Research Group on Language. The sample consisted of 39 males and 27 females aged 5 to 6 years (mean age of 66.53 months, standard deviation of 5.54). This age group was selected because the database was built from linguistic data of preschool children (3-6 years old), so we opted to select children in the age group prior to the last phase of elementary school.
Inclusion criteria were children who had been previously subjected to phonoaudiological and hearing screening. Hearing screening consisted of an audiometric evaluation performed by the physician Cláudia Vieira Cardoso (Audiologist at UNESP), using an audiometer AD-28 Interacoustic with TDH-39 headphones in an acoustic enclosure installed in a room at the elementary school Sítio do Pica Pau Amarelo. The frequencies of 1000, 2000, and 4000 Hz were assessed at 20 dB HL (hearing level). To ensure response reliability, auditory inputs were presented three times at each frequency. Children were considered to have passed the screening if they responded to at least two of the three inputs at each frequency in each ear.
Exclusion criteria were neurological or language disorders (detected upon phonoaudiological examination), as well as otological/auditory disorders (detected upon auditory examination), since these problems could interfere with auditory perceptual performance. To detect neurological and language disorders, the phonoaudiological assessments were performed individually, following a specific protocol adopted by the Speech-Language Therapy course at UNESP.
At first, 72 children were selected from the database, but six of them presented auditory alterations upon audiological screening and were therefore excluded from the sample and referred for a more detailed audiological assessment.
The caregivers authorized children's participation in the study by signing an informed consent form.

Material
We used the Speech Perception Assessment Tool (13), aided by the software Perception Evaluation Auditive & Visuelle (PERCEVAL) (14).
This tool was designed to assess the auditory perception of children (from 4 years old on) based on a task of identification (also called forced-choice task) of phonological contrasts of BP in syllabic onset. It gathers mainly disyllabic paroxytone words that are familiar to children and contain the 19 consonant phonemes of BP in stressed position.
The criteria for word selection by the creator of the tool were: (1) presence of BP phoneme contrasts composing minimal pairs; (2) easy representation by images; (3) words belonging to children's vocabulary; and (4) words that were previously chosen in another study (15).
The PERCEVAL tool is composed of four experiments: (a) PERCIvowels (which assesses the performance of children in identifying phonic contrasts between stressed vowels); (b) PERCocl (which assesses the identification of phonic contrasts between occlusive sounds); (c) PERCifric (which assesses the identification of contrasts between fricative sounds); and (d) PERCison (which assesses the identification of contrasts between sonorant sounds).
Considering the purposes of this study, we only applied the PERCIvowels experiment.
In the construction of the PERCIvowels (Chart 1), 42 words composing minimal pairs (by combinatory analysis: seven vowels × six possibilities of combination between them = 42 words, being 21 contrastive pairs) were allocated.
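The combinatorial count described above can be checked with a short sketch. The vowel labels are ASCII stand-ins chosen here for illustration ("E" for /ɛ/ and "O" for /ɔ/), and the code is not part of the original tool.

```python
from itertools import combinations, permutations

# Seven stressed vowels of BP; "E" and "O" are ASCII stand-ins
# (an assumption of this sketch) for the low mid-vowels.
VOWELS = ["i", "e", "E", "a", "O", "o", "u"]

# Unordered contrastive pairs: C(7, 2) = 21
contrastive_pairs = list(combinations(VOWELS, 2))

# Ordered (target, competitor) combinations: 7 x 6 = 42, one per word
target_competitor = list(permutations(VOWELS, 2))

print(len(contrastive_pairs), len(target_competitor))
```

Each contrastive pair therefore contributes two words, one for each member of the pair used as the target.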
It is worth highlighting that the inclusion of words such as "feira" in contrast with "fera", and "touca" with "toca", was based on the observation that some diphthongs alternate with monophthongs (e.g., p[ej]xe~p[e]xe, f[ej]ra~f[e]ra) while others do not (e.g., r[ej]tor~r[e]tor). Bisol (16) suggested that the former are false or light diphthongs, related to one V element only, while the latter are true diphthongs, related to two V elements.
The words composing PERCIvowels were recorded with high-performance equipment in an acoustic enclosure by an adult BP speaker. The speaker was asked to say the target words in a carrier phrase ("Say target word to him"), so that the typical rising contour produced when isolated words are repeated could be avoided. After the recordings, the words of the minimal pairs were extracted from the phrases with the software PRAAT (17), thus constituting the auditory inputs of the experiment.
In parallel with the audio recordings, we selected public-domain images representing each word by searching Google Images (http://images.google.com.br/). The images were cropped and edited in a standard way with the software Paint and composed the visual inputs of the experiment.
Once auditory and visual inputs were established, we organized a script to be followed upon using the software PERCEVAL during the experiment.

Experimental procedure
The experimental procedure was composed of one identification task (also known as forced-choice task) with three phases: word recognition, training, and test.
To perform the task, children were put comfortably in front of the computer (where PERCEVAL was running), wearing KOSS earphones, in an acoustic enclosure installed in their school.
The recognition phase consisted of presenting the auditory and visual inputs to the children, so that their familiarity (or not) with the selected words could be identified. After that, the children's knowledge of the words was tested. We adopted an 80% success rate criterion to decide whether children could proceed to the training phase and, later on, to the auditory perception test itself. All subjects were reported to recognize at least 80% of the words.
The training phase was run automatically by the software to ensure that the children understood the task. It consisted of the same perceptual identification task, but without the results being recorded by the software. Ten randomly selected inputs were presented. The test itself started after a two-minute interval.
In the identification task, which comprises the training and test phases, each child would individually listen (binaural presentation at 50 dB) to a word from a minimal pair and then choose, from two images presented on the computer screen, the one representing that word. For instance, the word "fogo" was played to the child and, right after, two images, one representing "fogo" and the other "figo", would appear on the screen for the child to indicate the one corresponding to the auditory input. Patterns of response were: success, when the child chose the correct image; error, when the child chose the wrong image; and non-responsive, when the child did not react within a pre-established period of time.
The duration of the auditory and visual inputs (6000 ms) and the response window (4000 ms) were controlled and automatically measured by the PERCEVAL software, so children were considered "non-responsive" if they did not present any pattern of response within 4000 ms. When the child did respond, successfully or with an error, the response time was automatically computed by the software, ranging from 0 to 4000 ms.
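The three response patterns and the 4000 ms window can be summarized in a minimal sketch. This is not PERCEVAL code: the function name, argument names, and the use of `None` for a missing response are assumptions made here for illustration.

```python
from typing import Optional

RESPONSE_WINDOW_MS = 4000  # response window stated in the text


def classify_trial(target: str, chosen: Optional[str],
                   reaction_ms: Optional[float]) -> str:
    """Return 'success', 'error', or 'non-responsive' for one trial.

    `chosen` is the word whose image the child selected, or None when
    no image was selected (hypothetical encoding, not PERCEVAL's).
    """
    if chosen is None or reaction_ms is None or reaction_ms > RESPONSE_WINDOW_MS:
        return "non-responsive"
    return "success" if chosen == target else "error"


# Example with the minimal pair fogo/figo mentioned in the text
print(classify_trial("fogo", "fogo", 1800.0))
print(classify_trial("fogo", "figo", 2600.0))
print(classify_trial("fogo", None, None))
```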
The test took 15 minutes with each child.
In order to identify the auditory perceptual process involved in the task, children's reaction time was taken into account upon decision-making in the task.

Criteria for analysis
The following criteria were used for analysis: (a) auditory perception accuracy (rate of errors, successes, and absence of response); (b) reaction time for both errors and successes; and (c) ability to identify stressed vowel contrasts. In the statistical analysis, the F test was applied to compare the reaction time for errors and successes, and Spearman's correlation coefficient was used to assess the association between the variables "age" and "auditory perception accuracy".
It is important to note that correlation is a measure of association between two or more variables, and the coefficient may range from -1.00 to +1.00, with -1.00 being a perfect negative correlation and +1.00 a perfect positive correlation. The significance level was set at α=0.05, with a 95% confidence interval.
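As an illustration of the coefficient described above, Spearman's correlation can be computed as the Pearson correlation of the ranks. The sketch below uses only the Python standard library. The ages and accuracies are hypothetical values, not data from this study, and the statistical package actually used by the authors is not stated.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical ages (months) and accuracy rates, for illustration only
ages = [60, 62, 65, 68, 71]
accuracy = [0.80, 0.85, 0.83, 0.90, 0.95]
rho = spearman_rho(ages, accuracy)
print(round(rho, 2))
```

A value close to +1.00 indicates that accuracy tends to rise with age, and a value close to -1.00 indicates the opposite.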
To assess children's ability to identify stressed vowel contrasts, we used a confusion matrix (18) to compute errors and successes quantitatively and qualitatively. This kind of analysis provides information about the most and least difficult contrasts and about recurrent error patterns.

RESULTS
The mean reaction time was 2243.83 ms for errors and 2158.31 ms for successes. Student's t-test could not be applied to compare the mean times because the variances were not homogeneous, so one criterion for its use was not met. The variances of errors and successes were compared by the F test, which showed a statistically significant difference (F=4.23, p=0.00) (Figure 1).
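The statistic behind an F test of two variances is the ratio of the sample variances. The sketch below shows only that ratio, using hypothetical reaction times rather than the study's data. Placing the larger variance in the numerator is a convention assumed here, and the p-value, which requires the F distribution, is left out.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """F statistic for two variances: larger sample variance over smaller."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    return max(var_a, var_b) / min(var_a, var_b)


# Hypothetical reaction times in ms (not the study's data)
error_rts = [1500.0, 2100.0, 2900.0, 3400.0]
success_rts = [2000.0, 2100.0, 2200.0, 2300.0]

f_stat = variance_ratio(error_rts, success_rts)
print(round(f_stat, 2))
```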
The vowel contrasts presenting higher or lower degrees of difficulty in auditory perception are listed in the confusion matrix (Table 1).
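A confusion matrix of this kind can be built by counting, for every trial, the vowel of the target word against the vowel of the word the child chose. The sketch below is a minimal illustration with hypothetical trials. The data layout and function names are assumptions, not the procedure of reference (18).

```python
from collections import Counter


def build_confusion(trials):
    """Count (target vowel, chosen vowel) pairs across trials."""
    return Counter((target, chosen) for target, chosen in trials)


def accuracy(confusion):
    """Share of trials in which the chosen vowel matches the target."""
    total = sum(confusion.values())
    correct = sum(n for (t, c), n in confusion.items() if t == c)
    return correct / total


# Hypothetical trials, for illustration only
trials = [("i", "i"), ("i", "e"), ("e", "e"), ("e", "i"),
          ("e", "i"), ("u", "u"), ("o", "u"), ("a", "a")]
confusion = build_confusion(trials)

print(confusion[("e", "i")], accuracy(confusion))
```

Off-diagonal cells with high counts point to the contrasts that were hardest to identify.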
Furthermore, the error pattern was organized by three parameters: (1) vowel height (high, mid, or low); (2) anterior/posterior dimension (classified as anterior, central [vowel /a/], and posterior); and (3) error trend within the vowel range, that is, peripheral to non-peripheral vowel and vice-versa.
The error trend within the vowel range was established for each pair in order to identify the prevalence of a certain direction. For instance, for the pair /i/~/e/, /e/~/i/, the number of errors was evaluated from peripheral to non-peripheral (/i/~/e/) and vice-versa (/e/~/i/). As a result, we found a prevalence of errors in the non-peripheral to peripheral direction (58.33%) as opposed to the other (41.67%).
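The direction analysis can be sketched as follows. The peripheral set /i/, /a/, /u/ follows the vowel triangle described in the text. Reading each error as "target heard, competitor chosen" is an assumption of this sketch, and the error counts are hypothetical because the raw counts are not reported.

```python
PERIPHERAL = {"i", "a", "u"}  # corners of the vowel triangle


def direction_shares(errors):
    """Return (share toward peripheral, share toward non-peripheral).

    `errors` is a list of (target, chosen) vowel labels; only errors
    between a peripheral and a non-peripheral vowel are counted.
    """
    to_peripheral = sum(
        1 for t, c in errors if t not in PERIPHERAL and c in PERIPHERAL)
    to_non_peripheral = sum(
        1 for t, c in errors if t in PERIPHERAL and c not in PERIPHERAL)
    total = to_peripheral + to_non_peripheral
    if total == 0:
        raise ValueError("no errors between peripheral and non-peripheral vowels")
    return to_peripheral / total, to_non_peripheral / total


# Hypothetical counts: 7 errors /e/ -> /i/ and 5 errors /i/ -> /e/
errors = [("e", "i")] * 7 + [("i", "e")] * 5
to_p, to_np = direction_shares(errors)
print(f"{to_p:.2%} toward peripheral, {to_np:.2%} toward non-peripheral")
```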
Finally, the variables age and accuracy were compared using Spearman's correlation coefficient, which showed a positive correlation between age and auditory perceptual accuracy (r=0.41, p=0.00). Therefore, we may infer that children's performance tends to improve with age (Figure 2).

DISCUSSION
Two aspects deserve attention regarding the auditory perception of BP stressed vowels. The first relates to the success rate (88%), which suggests that the studied age group has not yet achieved full mastery or perfect accuracy in identifying stressed vowels. These results agree with previous studies (9-11) that described auditory perceptual accuracy in the mother language of children up to the age of 7 years (11,19). The second aspect relates to the comparison of accuracy in the identification of vowels and of occlusive consonants, the latter reported in another study conducted with children in the same age group (8).
The accuracy rate was 85% in the identification of occlusive consonants against 88% for vowels. Although the difference is only three percentage points, it suggests that auditory perception not only depends on the phonic group but also seems to be more accurate for vowels. This hypothesis is supported by a study (20) that converted the percentages into a sensitivity index named d-prime.
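For readers unfamiliar with d-prime, one common conversion for an unbiased two-alternative forced-choice task is d' = √2 · z(PC), where PC is the proportion of correct responses and z is the inverse of the standard normal cumulative distribution. The sketch below applies this textbook formula to the two accuracy rates quoted above. Whether study (20) used this exact conversion is not stated here, so this is an assumption.

```python
from statistics import NormalDist


def dprime_2afc(proportion_correct: float) -> float:
    """d' for an unbiased two-alternative forced-choice task.

    Uses d' = sqrt(2) * z(PC); this conversion is assumed here and
    may differ from the one used in the cited study.
    """
    if not 0.0 < proportion_correct < 1.0:
        raise ValueError("proportion_correct must lie strictly between 0 and 1")
    return 2 ** 0.5 * NormalDist().inv_cdf(proportion_correct)


# Accuracy rates quoted in the text: vowels 88%, occlusive consonants 85%
print(round(dprime_2afc(0.88), 2), round(dprime_2afc(0.85), 2))
```

Under this conversion, a higher proportion correct always yields a higher d', so the ordering of vowels above occlusive consonants is preserved.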
The successes in the identification of vowels may be explained by their acoustic features (21). Vowels are long-lasting segments bearing more acoustic energy, and they present frequency reinforcement (formants) in a range that is audible to the human ear, which favors perception. Regarding reaction time during the task, the prediction was based on a classical study (22): the greater the acoustic difference between the two inputs of a pair, the faster the response (shorter reaction time), and the smaller the acoustic difference between inputs, the longer the reaction time. In terms of arithmetic means, the results agree with the hypothesis that errors would be associated with longer reaction times than successes. One may thus infer that the vowel contrasts related to errors (such as /i/~/e/ and /e/~/ɛ/) present auditory perceptual similarities that demand a longer reaction time during the psycholinguistic processing that precedes decision making.
Regarding the vowel contrasts presenting higher or lower degrees of difficulty, we found differences in children's performance both for contrast pairs and for individual phonemes. More specifically, when we analyze the most recurrent error patterns in terms of vowel height, the front-back dimension of the vowel triangle, and the direction of the error within the vowel range (peripheral to non-peripheral vowels and vice-versa), two findings are relevant. The first relates to graded performance depending on the contrast in question: auditory perception in the task was not similar or comparable across all contrasts, which suggests different degrees of similarity between the BP stressed vowels. Our findings show that the most similar contrasts were /i/~/e/, /e/~/ɛ/, /ɛ/~/ɔ/, and /o/~/u/ and the least similar ones were /i/~/o/, /ɛ/~/o/, and /ɛ/~/u/.
Different degrees of similarity between vowels have been reported in the international literature. A study on the acoustic features of the vowel range in American English and their relation to vowel identification (23) not only showed different accuracy rates in auditory perception, but also highlighted that most errors involve the minimal pairs /æ/~/e/ and /ɑ/~/ʌ/ due to their proximity in the vowel quadrilateral.
More recently, in research on the perception and production of German vowels by monolingual native children and by Turkish children acquiring German (24), the authors reported different rates of perception in both groups for the studied minimal pairs and concluded that the ability to distinguish sounds decreases as the auditory perception increases.
Beyond the observation of different degrees of auditory perceptual similarity between vowels, the second finding that warrants attention is the fact that this similarity is not symmetrical in the front and back axis and in the vowel range direction.
On the front-back parameter, anterior vowels (/i/, /e/, and /ɛ/) were related to a higher rate of errors than central (/a/) and posterior vowels (/ɔ/, /o/, and /u/). Similarly, regarding errors within the vowel range, the prevalent direction was from non-peripheral vowels (mid-vowels) towards peripheral ones.
Asymmetry in the identification of vowels, as well as the analysis of its direction, has been described in the literature in terms of production (25) and perception (23,26-29). Regarding vowel perception by children, some authors reported the presence of asymmetry, mainly in the 1990s (26). When investigating the identification of the German minimal pairs /u/-/y/ and /ʊ/-/Y/ by 6-12-year-old German children learning English, the authors observed an asymmetry in perception, that is, better performance from /y/ to /u/ (instead of /u/ to /y/) and from /Y/ to /ʊ/ (instead of /ʊ/ to /Y/).
Another study (27) confirmed the asymmetry in the perception of vowel contrasts by children of the same age group in their native language and in a language other than theirs. In 2001, a study (28) was conducted to assess the direction of asymmetry in the perception of vowels by children. The results confirmed the hypothesis that asymmetry within the vowel range runs from non-peripheral towards peripheral vowels, as the latter would serve as reference vowel sounds.
Subsequently, an extensive literature review (29) about this asymmetry showed that the direction (non-peripheral towards peripheral vowels) is present in children and in adults. According to the authors, the hypothesis for the existence of perceptual asymmetry is that inputs are not equally stressed within a domain of perception. Also, the peripheral vowels are said to function as a kind of "anchor" in the auditory perception task, which the authors call natural perceptual magnets.
The national literature includes a study (25) on the acquisition of vowel sounds in BP and language typology in which the author recognizes asymmetry as a factor supporting the hypothesis of universality in the building of phonological inventories and in the phases of the acquisition of BP vowel sounds, especially the low mid-vowels. In an attempt to explain and describe asymmetric vowel systems (25), the author mentions markedness relations in vowel segments, proposing the following order: labial>coronal>dorsal. This means that, for vowels, [dorsal] is the most harmonic place of articulation, followed by [coronal] and then [labial], the least harmonic one.
Extending these production results to the perception results of this study, there seems to be agreement on the anteroposterior parameter. As previously described, the errors in vowel identification were mostly related to anterior vowels (which present the [coronal] feature) rather than to posterior ones, which involve a co-occurrence of features ([dorsal/labial]).
Finally, the age of the children was shown to be an important factor in auditory perceptual performance: younger children were less accurate than older ones, and increasing age seems to be related to an improvement in accuracy. This result agrees with previous findings (8-11,19,30-32), which suggest that the auditory perceptual acquisition of phonological contrasts takes place gradually.
In short, the results suggest that the mastery of auditory perception of vowels is gradual and asymmetric. Therefore, the speech therapist must consider the different degrees of similarity between minimal pairs and emphasize their auditory perception when working on both production and perception, so that the factors favoring or hindering the emergence of a certain contrast can be identified.

CONCLUSION
The findings of this study show that 5-6-year-old children have not yet achieved full mastery or perfect accuracy in the identification of stressed vowels. They also support the hypothesis that reaction time during the task is related to different degrees of similarity between inputs: the greater the acoustic difference between minimal pairs, the faster the children's response (the shorter the reaction time), and the smaller the difference between inputs, the longer the reaction time.
Beyond the finding of different degrees of auditory perception similarity between vowels, we also observed an asymmetry related to the front and back place of articulation in the vowel range.
Anterior vowels present more similarity compared to posterior ones. Peripheral vowels (occupying the extremities of the vowel triangle) seem to function as an "anchor" in the perception task, as the error tendency was from non-peripheral toward peripheral vowels. The order of acquisition in production and in perception was not always parallel.
Children's age was positively associated with auditory perceptual accuracy, which suggests that the ability to identify vowel contrasts improves with age.
This study should be extended to the identification of phonological contrasts involving other sound groups. The sample should also be larger and include other age groups.
*LCB was responsible for the study design, general orientation for study conduction, and writing of the paper; LMRR was responsible for data collection, tabulations, and organization.

Figure 1. Comparison between time of response and errors or successes

Figure 2. The association between age and accuracy in auditory perception

Table 1. Confusion matrix for the task of vowel contrast identification