Correlations between vocabulary and phonological acquisition: number of words produced versus acquired consonants

Accepted: December 22, 2015
Study carried out at Center for Language and Speech Research, Universidade Federal de Santa Maria – UFSM, Santa Maria (RS), Brazil.
1 Universidade Federal de Santa Maria – UFSM, Santa Maria (RS), Brazil.
Financial support: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul (FAPERGS).
Conflict of interests: nothing to declare.


INTRODUCTION
In the complex and intriguing study of oral language, several parameters can be considered in order to evaluate and analyze development. Researchers therefore need to be insightful and creative in using different theories to better explain acquisition, evolution and use, as well as differences and "flaws" when these are found. To circumscribe the object of study, language is divided into systems: phonology, vocabulary/lexicon/semantics, syntax, morphology and pragmatics. This article investigates the typical acquisition of phonology and vocabulary in Brazilian Portuguese and the relationships that occur between these systems in the course of acquisition.
Much is already known about the acquisition of Brazilian Portuguese consonants, especially in the dialectal variant of Rio Grande do Sul State, as exemplified by the studies conducted by researcher Regina Lamprecht and compiled in book form (1). As for the ages of acquisition of the phonemes in the simple onset position, it has been found that at one year and six months of age, the child produces the phonemes /p, b, t, d, m, n/; at one year and nine months most of the back nasals and stops and the fricatives /f/ and /v/ are stabilized; at two years and two months the affricates and /s/ and /z/ appear; at two years and eight months the phonemes /ʒ/ and /l/ appear; at two years and ten months the child has mastered /ʃ/; at three years and four months /R/ has been mastered; at three years and ten months /ʎ/ appears; and at four years /r/ appears. With regard to syllable structures, simple onset is the earliest structure, followed by coda with /S/, acquired at two years and six months, and coda with /R/, acquired at three years and ten months. Complex onset only appears at the age of five and is therefore considered a late acquisition (1).
Another author (2), also from Rio Grande do Sul, aiming to describe the phonological acquisition of Brazilian Portuguese in children with phonological deviation, created the Implicational Model of Feature Complexity (IMFC), which explains the acquisition of segmental consonants through implicational relationships between features, as illustrated in Figure 1.
The features that appear in the model are marked features, i.e. those with greater complexity, and the lines represent the relationships that exist between these features, which can be strong or weak implicational relationships. The root corresponds to features that have zero complexity (/p/, /t/, /m/ and /n/ phonemes), which make up the basic representational structure given in universal grammar and present only unmarked features. From the root, or zero state, emerges a tree structure in which the branches represent conditions of markedness: the farther a branch is from the root, the more complex these conditions are. If there are two or more features or combinations of features on the same path, this indicates that there is an implicational relationship between them. If one feature or combination of features is the target of two or more converging paths, this means that in order for one of these features to be specified, the set of features corresponding to these paths must already have been specified (2).
Vocabulary has not been as widely investigated as phonology in Brazil. In Brazilian Portuguese, it was found that between one year and four months and one year and six months, the mean growth of vocabulary is four words per month, while between one year and ten months and two years, this growth is 25 words per month (3).
Studies that correlate different fields of language, such as vocabulary and phonology, are even scarcer. One study suggested that changes in vocabulary observed in children with specific alterations in language development can be explained by difficulties in skills and/or characteristics directly related to the mechanisms involved in information processing, which hinder the quality and retrieval of the phonological and semantic representations of a new lexical item (4).
Another study conducted in 2012 (5), with 15 children with typical development at two years of age, used a spontaneous naming method for words the children already knew in order to analyze the influence of frequency of lexical types, similarity of phonological types, age of acquisition and phonotactic probability on the variability and precision of production. The authors found that phonological complexity plays an important role, in that words with late-acquisition phonemes and syllable structures are produced with greater variability, and that both the frequency of lexical types and phonological similarity influence the observed variability in speech: the greater the effect of frequency of lexical type and phonological similarity, the less variability in productions. In addition, children with large vocabularies have more complex phonological systems than those with small vocabularies, since the larger number of words produced can generate the requirement of a more advanced phonological system (6).
A recent study (7) revealed that children in the early stages of phonological acquisition select the words they will say based on how "pronounceable" they are. Likewise, other authors (8) found that children who produce fricative phonemes early have better vocabulary and syntax skills than those presenting difficulties with fricatives. This article aimed to verify possible correlations between the number of lexical types and the number of consonants in the general phonological system of children with typical language development.

METHODS
This quantitative and descriptive study with prospective data collection is part of a project approved by the UFSM Research Ethics Committee under the number 0219.0.243.000-11. Authorization from participants' guardians was requested after clarification, reading and signing of the informed consent form, an essential condition for participation in the study.
The 186 children who participated were aged between one year and six months and five years, 11 months and 29 days; all were members of monolingual families who speak Brazilian Portuguese and presented typical language development. Exclusion criteria were: hearing loss or neurological, emotional, and/or cognitive impairment detectable by means of observation; the presence of motor or organic oral alterations; or current participation in speech therapy.
Nine children were between 1 year, 6 months and 1 year, 11 months, 29 days; 13 children were between 2 years and 2 years, 3 months, 29 days; 13 children were between 2 years, 4 months and 2 years, 7 months, 29 days; 16 were between 2 years, 8 months and 2 years, 11 months, 29 days; and there were 15 children in each of the following nine age groups (of four months each), starting with the group aged between 3 years and 3 years, 3 months, 29 days, and going up to the last group, aged between 5 years, 8 months and 5 years, 11 months, 29 days.
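The age bands above can be written down as a small lookup, which also makes it easy to check that the group sizes add up to the 186 participants. This is only an illustrative sketch: the names `GROUP_SIZES` and `age_group_index` are hypothetical, and the four-month width of the bands from 2 years onward is inferred from the boundaries stated in the text.

```python
# Group sizes in the order given in the text:
# 1;6-1;11 (one six-month band), then four-month bands from 2;0 to 5;11.
GROUP_SIZES = [9, 13, 13, 16] + [15] * 9


def age_group_index(years: int, months: int) -> int:
    """Return the 0-based age-group index for a child's age.

    Assumes (inferred from the stated boundaries) a first band of
    1;6-1;11 and four-month bands from 2;0 onward.
    """
    total_months = years * 12 + months
    if not 18 <= total_months < 72:
        raise ValueError("age outside the range studied (1;6 to 5;11)")
    if total_months < 24:
        return 0
    return 1 + (total_months - 24) // 4


if __name__ == "__main__":
    print(len(GROUP_SIZES), "groups,", sum(GROUP_SIZES), "children")
    print(age_group_index(3, 2))  # group starting at 3 years
```
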
The data were collected at eight municipal elementary schools in the city of Santa Maria-RS, located in different regions of the city. The speech assessment consisted of a questionnaire for parents or guardians, an orofacial praxis assessment, an assessment of oral language and of phonetic and phonological aspects of speech, and a hearing assessment.
The questionnaire, sent home to parents and guardians, sought information about pregnancy, birth, language and motor development of the child, medical history, current behavior, history of bilingualism, and general aspects of family history and dynamics.
In the oral motor assessment, an orofacial myofunctional protocol with scores (AMIOFE) (9) was adapted to the needs of this research. This protocol was used as an oral motor assessment of appearance, normal position, muscle tension and mobility. Respiratory function was also analyzed.
Oral praxis in children who were three years and six months or older was assessed through a protocol for dyspraxia assessment (10). Children under the age of three years and six months did not have this aspect evaluated because there are no reference values for ages below this.
Language assessment was performed using a behavioral observation protocol (11), which allowed us to observe cognitive and language development. This protocol was designed for children between one and four years of age, is easy to apply and provides defined reference values. For children over four years, spontaneous conversation was observed through answers to questions and an analysis of short spontaneous oral narratives.
The assessment of phonetic aspects of speech for children three years and six months or older was based on repetition of phonetically balanced words, allowing detection of articulation and phonological alterations that may occur in speech. Children under three years and six months were informally asked to repeat a few words, using playful activities.
Visual reinforcement audiometry, which is used in children between six and 24 months (12), was used to assess hearing in children up to two years, six months and 29 days of age, using a portable pediatric audiometer with modulated pure tones (warble) at frequencies of 500, 1000, 2000 and 4000 Hz, at intensities from 20 dB to 80 dB, presented in a free field. Frequencies were alternated (500, 1000, 2000, 4000 Hz) until the minimum response level for each frequency was reached. Responses between 20 and 40 dB were considered normal (13). Although this assessment is suitable for children up to two years, it was necessary to extend the age range, because children at that age were found to be unable to respond to conditioned play audiometry.
Children between two years, seven months and five years, 11 months, 29 days underwent a hearing assessment with conditioned play audiometry or pure tone audiometry (14), using a properly calibrated Interacoustics Screening Audiometer AS208. Hearing thresholds in air conduction were verified from 500 to 4000 Hz at 20 dB. If there was a failure in the responses at one or more frequencies and in two consecutive screenings, the child was referred for a complete hearing and otorhinolaryngologic assessment.
Children with appropriate responses underwent the phonological and vocabulary assessments. These assessments were performed using spontaneous speech and naming of small objects and toys from a pre-elaborated list, selected based on the Child Phonological Assessment (Avaliação Fonológica da Criança, AFC) (15). This instrument allows an assessment of the possibilities of occurrence for each consonant of Brazilian Portuguese in all possible positions.
Video recordings were made using a Samsung SMX-C200 camcorder and stored on an external hard drive for broad phonetic transcription of the children's speech and alphabetical transcription of the examiners' speech. For the phonetic transcription of the speech of children up to 3 years, 3 months, 29 days, the consensus method was used (16), i.e. two judges worked independently on the transcription. The transcripts were then compared and discrepancies were heard by a third evaluator until agreement was reached for all utterances/words/sounds produced by the child. If there was no agreement between at least two evaluators, the passage was deleted. This procedure assured the reliability of the transcriptions while preventing the deletion of a large number of words, since young children, even with typical development, present greater variability in productions and articulatory immaturity.
For the children from the other age groups, who present more stable productions, the following method for reliability between transcriptions was used: all samples were transcribed by an evaluator experienced in child language, and a second evaluator with the same experience independently transcribed 20% of the same sample for reliability (5,17). There was a mean agreement of 79.6% for the 3-year age groups, 81.9% for the 4-year groups and 80.1% for the 5-year groups.
Filming lasted, on average, 20 minutes, in order both to obtain a representative sample of speech and to ensure the viability of the transcriptions, considering the large number of individuals. The reference for diagnosing phonological alterations was the set of studies conducted by Regina Lamprecht (1), considering a margin of error, given the uniqueness of phonological acquisition. This reference was chosen because it is from the same state where the current research was conducted.
Contrastive Analysis was used initially for the speech assessments. It consists of four stages: Phonetic Description 1, a record of consonant sound execution; Phonetic Description 2, a record of the phonetic inventory based on categories of articulation and executions of consonant clusters; Contrastive Analysis 1, a record of percentages of occurrences and possibilities of substitutions and omissions made by the child; and Contrastive Analysis 2, which presents the phonological system used by the child, including contrasts, substitutions and omissions (15).
From this analysis, the following criteria were used to establish the phonological inventory (18): occurrence from 0% to 39% indicates that the phoneme is not acquired; occurrence between 40% and 79% indicates that the phoneme is partially acquired; occurrence equal to or greater than 80% indicates that the phoneme is acquired. To determine the features of the phonological systems of the individuals and the probabilities of producing each sound, the general phonological system of each individual was considered, analyzing consonants /p, b, t, d, k, g, f, v, s, z, ʃ, ʒ, m, n, ɲ, l, ʎ, r, R/ at positions of simple onset, /l, r/ at the second position of complex onset and /s, r/ at the coda position. It should be noted that lateral coda was produced in all cases as the semivowel [w], resulting in a diphthong (19). Postvocalic nasal consonants were considered to behave like the nasalization of the preceding vowel, being considered, therefore, as a floating autosegment (20). The nasal consonant may also become an element of the diphthong. Thus, nasal and postvocalic lateral consonants were not considered to be in the coda position.
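The acquisition criteria stated above amount to a simple threshold rule on the percentage of correct productions. The sketch below is illustrative only: the function name `acquisition_status` and the example counts are hypothetical, and fractional percentages between 39% and 40% or between 79% and 80% are assigned to the lower category, which is an assumption, since the text gives whole-number ranges.

```python
def acquisition_status(correct: int, possible: int) -> str:
    """Classify a phoneme from its percentage of correct productions.

    Thresholds follow the criteria in the text: below 40% is not
    acquired, 40% up to (but not including) 80% is partially acquired,
    and 80% or more is acquired.
    """
    if possible <= 0:
        raise ValueError("no production opportunities for this phoneme")
    if not 0 <= correct <= possible:
        raise ValueError("correct productions must be between 0 and possible")
    percentage = 100.0 * correct / possible
    if percentage >= 80:
        return "acquired"
    if percentage >= 40:
        return "partially acquired"
    return "not acquired"


if __name__ == "__main__":
    # Hypothetical counts: (correct productions, production opportunities).
    sample = {"/p/": (10, 10), "/s/": (5, 10), "/r/": (1, 10)}
    for phoneme, (correct, possible) in sample.items():
        print(phoneme, acquisition_status(correct, possible))
```
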
After completion of the contrastive analyses, the number of phonemes acquired in the general phonological system of each child was counted for each IMFC group (2), namely State 0 and Levels 1 and 2; Levels 3, 4 and 5; and Levels 6, 7, 8 and 9, and in all possible syllable structures.
For the classification of vocabulary data, the number of lexical types produced was counted in accordance with a previous study (21), using a simple sum of the number of words spoken by each child in the transcriptions. It is important to note that words spoken in repetition (said immediately after the examiner said them) were not considered.
All the words spoken by each child were entered into a spreadsheet (one item per cell). Then, the repeated items were deleted, leaving only one instance of each word. In this way, all the different words produced by the child were considered for the sum of the number of words spoken.
In cases where a word kept the same root but changed the suffix, e.g. for number, gender and verb inflection, it was considered a single lexical item (one spoken word). For example, boy/boys; like/liked; menino/menina (gender inflection in Portuguese).
In the case of verbal phrases, when two words were used together with a single meaning, as in "going to eat", this was also considered a single lexical item (one spoken word), since the first verb only contributes to express the time frame, but does not contribute to the meaning. Contractions, such as "pra" (para a), were recorded as a single lexical item (one spoken word).
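The counting rules described above (discard immediate repetitions, collapse inflected forms into one item, then count the distinct items) can be sketched as follows. This is not the spreadsheet procedure used in the study, only a minimal illustration: the function `count_lexical_types`, the `lemma_of` mapping and the sample tokens are all hypothetical, and the grouping of inflected forms is assumed to be supplied by hand.

```python
def count_lexical_types(tokens, lemma_of=None):
    """Count distinct lexical items in a child's sample.

    tokens:   iterable of (word, is_repetition) pairs, where
              is_repetition marks a word said immediately after
              the examiner said it.
    lemma_of: optional dict mapping an inflected form to the item it
              is grouped under (hypothetical, hand-built mapping).
    """
    lemma_of = lemma_of or {}
    types = set()
    for word, is_repetition in tokens:
        if is_repetition:
            continue  # repeated words are not counted
        form = word.lower()
        types.add(lemma_of.get(form, form))  # collapse inflections
    return len(types)


if __name__ == "__main__":
    # Hypothetical sample: "menina" is grouped with "menino",
    # and "bola" was an immediate repetition of the examiner.
    sample = [
        ("menino", False),
        ("menina", False),
        ("pra", False),
        ("bola", True),
        ("Menino", False),
    ]
    print(count_lexical_types(sample, {"menina": "menino"}))
```

Multi-word verbal phrases such as "going to eat" would need to be entered as one token before counting, since the sketch treats each token as a candidate item.
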
The mean number of lexical types produced (mean number of words spoken) was then calculated for each age group. The different grammatical classes produced were not considered in this study. Since the objective was to analyze the quantity of lexical items produced (how many words the children produce) at a certain age, only the number of words spoken by each individual in each filming was counted, in order to establish correlations between these data and those for consonant production.
The numerical data of lexical types (mean number of words spoken at the time determined by age group) and of the phonemes in the general phonological system in each syllable structure and in the different IMFC levels (2) produced by each individual were compared across age groups, using the Statistica 9.1 program and applying the Kruskal-Wallis non-parametric test, followed by multiple comparison tests.
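For readers unfamiliar with the Kruskal-Wallis test, the statistic it compares across age groups can be sketched in a few lines. The study itself used Statistica 9.1; the function below, `kruskal_wallis_h`, is a hypothetical stand-alone illustration that pools all observations, ranks them (ties receive the mean rank), and computes the tie-corrected H statistic. It does not compute a p-value or the follow-up multiple comparisons, and the example data are invented.

```python
def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    groups: list of lists of numeric observations, one list per group.
    Returns None when every observation is identical (H is undefined).
    """
    if len(groups) < 2 or any(len(group) == 0 for group in groups):
        raise ValueError("need at least two non-empty groups")

    # Pool observations, remembering which group each came from.
    pooled = [(value, g) for g, group in enumerate(groups) for value in group]
    pooled.sort(key=lambda pair: pair[0])
    n_total = len(pooled)

    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        tied = j - i + 1
        tie_term += tied**3 - tied
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += mean_rank
        i = j + 1

    correction = 1 - tie_term / (n_total**3 - n_total)
    if correction == 0:
        return None
    weighted = sum(r * r / len(group) for r, group in zip(rank_sums, groups))
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h / correction


if __name__ == "__main__":
    # Invented counts of acquired consonants for three age groups.
    print(kruskal_wallis_h([[6, 7, 9], [10, 12, 13], [15, 17, 19]]))
```
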
The same program was used to calculate correlations between the number of lexical types produced and the variables established for the phonological system of the individuals in each age group surveyed, using the Spearman rank correlation coefficient, followed by the Student's t-test to check the significance of the correlation. Some correlation coefficients could not be calculated because the values remained constant. For both tests, a significance level of 5% was used (p ≤ 0.05).
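The correlation step can be illustrated with a small stand-alone sketch. It is not the Statistica procedure used in the study: the functions `average_ranks` and `spearman_with_t` are hypothetical, and the data are invented. The sketch ranks both variables (ties receive the mean rank), computes the Pearson correlation of the ranks, and derives the t statistic t = rho * sqrt((n - 2) / (1 - rho^2)) with n - 2 degrees of freedom. It returns None when one variable is constant, which mirrors the cases in which no coefficient could be calculated. Obtaining the p-value would additionally require the Student's t distribution, which is not part of the Python standard library.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_with_t(x, y):
    """Return (rho, t, degrees_of_freedom), or None if x or y is constant."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, at least 3 each")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None  # a constant variable has no defined correlation
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    rho = sxy / math.sqrt(sxx * syy)
    if abs(rho) >= 1:
        t = math.copysign(math.inf, rho)
    else:
        t = rho * math.sqrt((n - 2) / (1 - rho * rho))
    return rho, t, n - 2


if __name__ == "__main__":
    # Invented data: lexical types and acquired consonants for five children.
    lexical_types = [80, 95, 110, 130, 150]
    consonants = [12, 11, 16, 15, 19]
    print(spearman_with_t(lexical_types, consonants))
    # A variable that stays constant yields no coefficient.
    print(spearman_with_t([80, 95, 110], [19, 19, 19]))
```
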

RESULTS
Table 1 presents data for consonants acquired in the general phonological system and in each possible syllable structure by age, as well as the comparison between these means.
To understand this table, it is important to keep a few things in mind: the maximum number of possible phonemes is the same for the general phonological system and for the simple onset position, namely 19; the maximum number of possible phonemes for complex onset and for coda is two.
Table 2 presents the data for consonant production in each IMFC level (2) by age group and the comparison between these means.
Table 3 presents the mean number of lexical types produced by age group and the comparison between these means.
Table 4 presents the correlations between the number of lexical types produced and the number of phonemes acquired in the phonological system, at the different IMFC levels (2) and in each possible syllable structure in the age groups surveyed. Few correlations showed statistical significance and most of the correlations were positive, that is, an increase in one variable was accompanied by an increase in the other.
The only significant negative correlation found was in the age group 5 years, 8 months to 5 years, 11 months, 29 days, between the lexical types produced and the number of phonemes acquired in the complex onset position. This means that the more lexical types children at this age produced, the more unstable the complex onset production was or, conversely, the fewer lexical types these children produced, the higher the frequency of correct complex onset production.

DISCUSSION
There was a gradual increase in the mean number of consonants acquired in all possible positions (Table 1), and the phenomenon of regression, which has often been mentioned (1,22), was not observed. Regression is usually analyzed as a single phenomenon and is more easily observed in longitudinal studies (22). We believe this phenomenon was not observed here because the means of acquisition of a large number of children were considered and only the number of consonants was counted, without analyzing them separately.
Studies carried out in Rio Grande do Sul, the same state in which this survey was conducted, have suggested complete acquisition of the Brazilian Portuguese phonological system at between five years and five years and two months, with complex onset being the last component to be acquired (1). However, in the present study, not all children presented a complete phonological system, even at the age of five. This may be due to either individual variations (1) or environmental interferences, since language is the result of the interrelation between the initial state (language acquisition system) and the course of experience (23).
In relation to syllable structures, acquisition followed the sequence simple onset → coda → complex onset, demonstrated by the statistically significant difference across age groups for each structure, as occurred in previous research (1). This can be explained by the degree of articulatory difficulty in the production of syllable components, where consonant-vowel (CV) is the simplest structure, followed by CVC and then CCV.
With regard to IMFC levels (2), quick stabilization and linear growth were observed in the consonants of 0 State and Levels 1 and 2, since they correspond to stops and nasal phonemes, which are acquired early in Brazilian Portuguese (1,19).
The phonemes of Levels 3, 4 and 5 also showed linear growth, but later stabilization, in the range from four years to four years, three months and 29 days. These levels correspond to the stops /k, g/ and the fricatives /f, v, s, z/. According to the literature, the first of these can be acquired from one year and seven or eight months (1) up to two years or two years and one month (19). The fricatives /f, v/ are acquired between one year and eight or nine months (1) and two years or two years and one month (19). Acquired a little later, the fricatives /s, z/ were found at between two years and two years and six months (1) and between two years and four months and two years and 11 months (19). Generally speaking, the children in this study presented later acquisition, because only in the first four-year age group did 100% of the children present all the phonemes for Levels 3, 4 and 5.
For Levels 6, 7, 8 and 9, corresponding to the posterior fricatives and liquids, there was not 100% acquisition in any age group. However, at the age of four years to four years, three months and 29 days, most children had acquired all these consonants, which is consistent with the literature (1,19).
In relation to the lexical types produced, in general there was a gradual increase, as was observed in other studies (24). The significant increase in lexical types produced between the first age group and that of three years, four months to three years, seven months, 29 days was similar to that found in a German study with a similar methodology (24).
There was a more pronounced increase in the means of lexical types between the age groups in the early stages of acquisition, similar to a study with French children, which found a marked progression in measures of lexical diversity between 24 and 36 months (25). This may be related to the vocabulary explosion phase, a period of rapid growth in the number of lexical items produced by children (21).
The constant and gradual increase of vocabulary occurs because it is the only language element with an infinite possibility of growth, that is, we acquire new words throughout our entire lives, even though the items used in our daily lives are predominantly acquired in the initial period of acquisition (21) .
Also in terms of the number of lexical types produced, the standard deviation analysis showed great variability among children within the same age group. This occurs due to individual variations and external factors that may influence the acquisition of language, as already mentioned (23).
As for the interrelations found, the only negative correlation found can be explained by studies that have found that children with large vocabularies have more complex phonological systems than those with small vocabularies. A large vocabulary may generate a demand for a more advanced phonological system (6). Likewise, changes in vocabulary in children with language changes are due to difficulties in information processing, which involves the phonological and semantic representations corresponding to a new lexical item (4).
One study (26) that compared performance on an expressive vocabulary test in children with and without phonological deviation showed that the correct naming of items (nouns) on the test was significantly higher in children with typical phonological development. This indicates that the better the phonology, the better the vocabulary will be in general, which also corroborates our results.
Words with phonetic properties that mirror pre-linguistic vocalizations will be acquired sooner than words with features or syllable structures that are not present in the child's pre-linguistic repertoire (6). This explains why most of the correlations found in the initial age groups were related to phonemes in the simple onset position and to the first levels of IMFC (2).
Most of the positive correlations found in this study were related to coda. A study of four-year-old American children showed that rhyme recognition is directly related to incidental acquisition of new words (27). Therefore, since coda structures come at the end of the syllable and may also be at the end of the word, it is assumed that children with better rhyme recognition skills will more easily recognize and produce them, which also influences vocabulary expansion.
The only statistically significant negative correlation was found between the production of lexical types and the production of phonemes in the complex onset position in the last age group. One possible explanation for this is that, in this age group, children have a more diverse vocabulary with more difficult grammatical classes, and a greater semantic requirement may lead to more errors in phonological production (28). In addition, this phonological difficulty is more pronounced in the more complex, later-acquired syllable structure. Moreover, some consonant clusters take longer to be acquired, even if they occur in words that are highly frequent in the child's input (29).
Thus, it was confirmed that vocabulary and phonology present a similar behavior during typical language development and that there are interrelations between them, most of which are positive correlations. It is important to note that these data corroborate a number of international studies (6,27-29), demonstrating that there is a relationship between lexicon and phonology that is common to different languages.

CONCLUSION
In typical language development, the phonological system grows gradually with the child's advancing age, both for the number of consonants acquired in the general phonological system and the different syllable structures related to IMFC levels (2). In terms of syllable structures, the order of acquisition was simple onset, followed by coda and complex onset. Neither coda nor complex onset presented 100% acquisition, even in the oldest age groups surveyed. In terms of IMFC levels (2), the order of acquisition was: 0 State and Level 1 and 2 phonemes → Levels 3, 4 and 5 → Levels 6, 7, 8 and 9. The latter group did not reach 100% acquisition in any age group.
The behavior of lexical types produced was similar to that of consonants, i.e. there was a gradual progression. However, in the first five-year group there was a small regression, although without statistical significance.
Correlations between vocabulary and phonology were found and most of these were positive, indicating that these systems are interdependent, that is, the more lexical types the child produces, the better his or her phonological system will be in general.Only complex onset presented a statistically significant negative correlation, which was in respect to lexical types in the oldest age group surveyed, indicating that the corresponding syllable structure may require more linguistic processing, negatively impacting vocabulary.

Table 2. Descriptive measures of the number of consonants in each IMFC level (2)

Table 3. Descriptive measures of the number of lexical types produced by age group

Table 4. Correlations between the number of lexical types produced and different phonological aspects of individuals
Caption: Student's t-test for significance of the Spearman's rank correlation coefficient; level of significance: p ≤ 0.05; for some variables it was not possible to calculate the correlation coefficient; GPS = General Phonological System; S0 = 0 State; L = Level; asterisk (*) indicates statistically significant values