Cortical auditory evoked potentials using the speech stimulus /ma/

Purpose: to compare cortical auditory evoked responses elicited by two speech stimuli, /ma/ and /da/, in normally hearing young adults. Methods: this was a cross-sectional, observational and analytical study. The sample comprised nineteen normally hearing young adults of both genders, aged 18 to 25 years, recruited by convenience. Cortical auditory evoked potentials (CAEP) were recorded monaurally in two conditions: 1) with the pair of speech stimuli /ba/ and /da/, and 2) with the pair of speech stimuli /ba/ and /ma/. The order of the experiments was randomized, and within each experiment the two stimuli were presented in a proportion of 50% each, totaling 100 stimuli per experiment. Speech sounds were presented at 70 dB SPL. Descriptive and analytical statistical tests were performed. Results: mean latencies of the P1, N1, P2, N2 and P3 components were shorter for /ma/ than for /da/ (p < 0.05). There was no difference in amplitude between responses evoked by /ma/ and /da/. Conclusion: cortical auditory evoked potentials elicited by the speech stimulus /ma/ had, on average, shorter P1-N1-P2-N2 and P3 peak latencies than those elicited by the speech stimulus /da/.


INTRODUCTION
Auditory evoked potentials (AEP) are electrophysiological responses evoked by sound and characterized by changes in electrical activity along the auditory pathway 1 . Cortical auditory evoked potentials (CAEP) are represented by positive (P) and negative (N) peaks. Peaks P1, N1 and P2 are mostly exogenous potentials, and N2 is considered a mixed peak. Their latencies are, respectively, 60 to 80 ms, 90 to 100 ms, 100 to 160 ms and 180 to 200 ms 2 . Another cortical response is a positive peak, called P3, that occurs between 220 and 280 ms. This peak originates mostly in the frontal or frontal-central regions 3 and is related to an initial sensory process. It is also related to attention to new stimuli 4 .
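To illustrate how such latency ranges can be used in practice, the following is a minimal sketch of window-based peak picking. The function name `pick_peaks`, the sample period and the input format are assumptions made for illustration only and are not taken from the recording software used in this study; the windows are the latency ranges cited above.

```python
# Latency windows in ms as cited in the text; polarity +1 marks a positive
# peak (P) and -1 a negative peak (N).
WINDOWS = {
    "P1": (60, 80, +1),
    "N1": (90, 100, -1),
    "P2": (100, 160, +1),
    "N2": (180, 200, -1),
    "P3": (220, 280, +1),
}


def pick_peaks(waveform_uv, sample_period_ms):
    """Return {peak_name: (latency_ms, amplitude_uv)}.

    waveform_uv: list of amplitudes in microvolts, with the first sample at
        stimulus onset (0 ms). This layout is an assumption of the sketch.
    sample_period_ms: time between consecutive samples (hypothetical value
        supplied by the caller).
    """
    peaks = {}
    for name, (start_ms, end_ms, polarity) in WINDOWS.items():
        # Collect every sample whose latency falls inside the window.
        candidates = [
            (i * sample_period_ms, value)
            for i, value in enumerate(waveform_uv)
            if start_ms <= i * sample_period_ms <= end_ms
        ]
        if not candidates:
            continue
        # Most positive sample for P peaks, most negative sample for N peaks.
        peaks[name] = max(candidates, key=lambda c: polarity * c[1])
    return peaks
```

In clinical practice peaks are usually marked or confirmed visually by an examiner, so an automatic pick of this kind would serve only as a starting point.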
Cortical responses are evoked by several types of stimuli, such as clicks, pure tones, or speech sounds. Speech sounds have different temporal and spectral parameters that are used as contrasts to evoke cortical responses. Usually, two syllables that differ in voice onset and/or place of articulation are used, such as /ta/ and /da/ or /ba/ and /da/ 5 .
The speech syllables used to evoke cortical responses are usually composed of a consonant and a vowel, for example /da/, /ta/ and /ba/. The consonant is briefer and evokes a transient response. The vowel evokes a sustained response, called the frequency following response (FFR) 6 . The syllable /da/ is the stimulus most frequently used in studies on CAEP (e.g., Kraus and Nicol, 2005 7 ). One study 10 investigated cortical responses of children with learning disabilities and reported that only the speech stimulus was able to identify learning problems, because it induces a more complex decoding process. Therefore, researchers have been searching for new speech stimuli that may be sensitive in detecting speech and hearing problems 8 .
The syllable /ma/ may be especially interesting regarding its learning context in early childhood. According to the linguistic generative theory, the way language is acquired by a young child is universal. One of the first sounds produced by babies, native to many different languages, is /ma/. This sound is frequently heard in childhood 11 . In addition, another language acquisition theory, called the emergent theory, holds that repetition of a phoneme in early childhood contributes to its consolidation in memory 12 .
Considering that different neural regions are activated by speech sounds 2 and that speech perception is the most relevant social function of the auditory system, studying CAEP with a new stimulus, such as /ma/, will contribute to understanding how the auditory system processes speech sounds, and may help in diagnosing auditory and/or language problems. Thus, the present study aimed to compare the cortical auditory evoked responses elicited by the speech stimuli /ma/ and /da/ in normally hearing young adults.

METHODS
Nineteen (19) male and female young adults, aged between 18 and 25 years, were randomly recruited and agreed to participate.

This research protocol is based on
All participants presented: hearing thresholds at or below 25 dB HL from 250 Hz to 8000 Hz, including the inter-octave frequencies 3000 and 6000 Hz; type "A" tympanograms with ipsilateral and contralateral acoustic reflexes present; absolute and inter-peak latencies of click-evoked Auditory Brainstem Responses (ABR) within normal limits; and a Montreal Cognitive Assessment (MoCA) score equal to or greater than 26 points. Participants with auditory processing and/or cognitive complaints, or a history of middle or outer ear infections, were excluded from the study.

Data collection procedures
After signing a consent form, participants underwent a first-step procedure that included inspection of the external auditory canal; application of the MoCA test to rule out slight cognitive deficits; immittance testing, pure-tone and speech audiometry, and click ABR.
Subsequently, CAEP were recorded in two conditions: 1) with the pair of speech stimuli /ba/ and /da/, and 2) with the pair of speech stimuli /ba/ and /ma/. The order of the experiments was randomized and the procedures are described below.
For the CAEP experiments, the participant remained in an acoustically treated booth, sitting comfortably in a reclining chair. The subject was asked to watch a subtitled movie on a tablet with the sound muted. The equipment used was the Opti-Amp 8008 from Intelligent Hearing Systems (IHS). Electrodes had a concave gold-plated contact area and were placed on Cz/A1-2 (vertex/right-left earlobe), with the ground electrode on Fpz (forehead). The stimuli were monaural, presented only to the right ear, in a random proportion of 50% for each of the two stimuli, totaling 100 stimuli for each experiment. Responses were recorded in a window of 500 milliseconds, with band pass filter

Data analysis
Data were tabulated and processed using the Statistical Package for the Social Sciences (SPSS, version 23.0). Tabular and graphical presentations, means, standard deviations, and hypothesis tests were used to analyze the data.
After characterization of the obtained data through descriptive statistical techniques, Kolmogorov-Smirnov testing was applied to check the normality of the distributions of the variables. Student's t-test for paired data was also used to compare the differences between the responses evoked by the proposed stimuli, in the case of variables with normal distribution. Values were considered significant when p < 0.05.
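The paired comparison described above can be sketched in a few lines. The analysis in this study was run in SPSS; the pure-Python version below is only an illustration of the paired Student's t-test, with the function names (`paired_t_test`, `t_cdf`, `t_pdf`) and the numerical integration approach chosen for this sketch. The Kolmogorov-Smirnov normality check is not reproduced here.

```python
import math


def t_pdf(u, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + u * u / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """CDF of Student's t for t >= 0, using Simpson's rule on [0, t].

    steps must be even; the integration scheme is an implementation choice
    of this sketch, not the method used by SPSS.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def paired_t_test(x, y):
    """Two-tailed paired t-test. Returns (t statistic, degrees of freedom, p-value)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_d / math.sqrt(var_d / n)
    df = n - 1
    p = 2.0 * (1.0 - t_cdf(abs(t), df))
    return t, df, p
```

With the 19 paired recordings of this study, `x` and `y` would hold, for example, the P1 latencies of each participant for /da/ and for /ma/, giving 18 degrees of freedom; a difference would be reported as significant when the returned p-value is below 0.05.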

RESULTS
The sample consisted of 19 participants, of whom 13 (68%) were female. All participants were aged from 18 to 25 years. The normality of the samples, regardless of sex, was checked using the Kolmogorov-Smirnov test, and the distributions were homogeneous and normal. Thus, the parametric Student's t-test was used in the analyses for the paired comparisons. Figure 3 shows the P1-N1-P2-N2-P3 complexes, comparing /ba/(1) (test performed with /ba/ and /da/) and /ba/(2) (test performed with /ba/ and /ma/). In Figure 4, the same complex can be observed, but evoked by the stimuli /da/ and /ma/. The results express the mean values found and their respective standard deviations.
As shown in Figure 3, there were no significant differences (p > 0.05) in latency or amplitude between /ba/(1) and /ba/(2) for any of the analyzed peaks (P1, N1, P2, N2 and P3).
It is observed in Figure 4 that, on average, all latencies of the peaks (P1, N1, P2, N2 and P3) were significantly lower when responses were evoked by the /ma/ stimulus. It was also shown that there were no significant differences in amplitude between the stimuli /da/ and /ma/. The p-values for each of the comparisons can be seen in Table 1 and Table 2. Figure 5 shows the grand average of the cortical auditory evoked potentials elicited by the phonemes /da/ and /ma/.
[Table note: Peak latency (ms); amplitude (µV). There were no statistical differences between the stimuli.]
[Table title: Latency and amplitude of the P1-N1-P2-N2-P3 complex evoked with /da/ and /ma/.]
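For readers unfamiliar with the term, a grand average such as the one reported for Figure 5 is the point-by-point mean of the individual participants' averaged waveforms. A minimal sketch follows; the function name `grand_average` and the list-of-lists input format are assumptions made for illustration.

```python
def grand_average(subject_waveforms):
    """Point-by-point mean across participants.

    subject_waveforms: list of waveforms, one per participant, each a list of
        amplitudes sampled at the same time points (hypothetical format).
    """
    if not subject_waveforms:
        raise ValueError("at least one waveform is required")
    n_samples = len(subject_waveforms[0])
    if any(len(w) != n_samples for w in subject_waveforms):
        raise ValueError("all waveforms must have the same number of samples")
    n_subjects = len(subject_waveforms)
    # Average the i-th sample of every participant.
    return [
        sum(w[i] for w in subject_waveforms) / n_subjects
        for i in range(n_samples)
    ]
```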

Discussion of methods
During the sample selection process, five incomplete recordings were excluded due to the need for paired data. This need is related to the fact that cortical sensory processing, even for identical stimuli, has a large variability among normal subjects 13 and it can be influenced by gender and age 14 .
Participants who failed to complete all the stimuli of the CAEP test in a single session were eliminated due to the possibility of changes in the P2 component. Ross and Tremblay (2009) 15 and Tremblay et al. (2014) 16 suggest that mere exposure to a stimulus during baseline EEG recording sessions, even in the absence of training, could contribute to increased P2 amplitude. For this reason, the pairs were also randomized.
In order to compare the two stimuli, /da/ and /ma/, /ba/ was chosen as a second, control stimulus, so that the behavior of the two phonemes of interest could be evaluated under equal conditions of interaction. The phoneme /ba/ presented in both situations (/ba/(1) with /da/, and /ba/(2) with /ma/) did not lead to significantly different results; differences were found only between the stimuli of interest, /da/ and /ma/ (Tables 1 and 2), indicating that the testing conditions were the same and that the differences were physiological.
Caption: P1 = first positive peak; N1 = first negative peak; P2 = second positive peak; N2 = second negative peak; P3 = third positive peak
For the CAEP assessment, the chosen proportion of presentation was 50% for each stimulus, out of a total of 100. Presenting the stimuli with equal probability is thought to give a clearer view of the individual characteristics of each stimulus, without an attention effect directed to one of the presented stimuli (the rare one), as occurs in the traditional oddball paradigm 17 .
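The equiprobable presentation scheme can be sketched as follows. This is an illustration only: the function name `equiprobable_sequence` and the `seed` parameter are hypothetical, and the sketch assumes that exactly 50 presentations of each stimulus were delivered in shuffled order, which is one reading of "a proportion of 50% for each, out of a total of 100".

```python
import random


def equiprobable_sequence(stimulus_a, stimulus_b, total=100, seed=None):
    """Return a shuffled presentation list with total/2 of each stimulus.

    Assumes an exact 50/50 split; an alternative reading of the protocol
    would draw each trial independently with probability 0.5.
    """
    if total <= 0 or total % 2:
        raise ValueError("total must be a positive even number for a 50/50 split")
    half = total // 2
    sequence = [stimulus_a] * half + [stimulus_b] * half
    # A dedicated generator keeps the shuffle reproducible when seed is given.
    random.Random(seed).shuffle(sequence)
    return sequence
```

In an oddball design, by contrast, the split would be unequal (one frequent and one rare stimulus), which is what introduces the attention effect mentioned above.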

Discussion of the results
The latency values of the P1, N1, P2, N2 and P3 CAEP components found here were similar to the values reported in the literature: P1 between 54 and 75 ms, N1 between 80 and 150 ms, P2 between 145 and 200 ms, N2 between 180 and 250 ms 18 and P3 between 220 and 350 ms 19 . However, the amplitude values found here were smaller than those reported in the literature. Several factors have been reported to influence this variation in amplitude, such as body temperature, time of day, food intake shortly before the examination, season, and even personality factors 20 .
Time perception of the stimulus can be observed in latency recordings 21 . It was seen here that the stimulus /ma/ presented a shorter perception time, revealed by CAEP latencies that were, on average, lower than those for the /da/ stimulus, for all components.
In order to better explain the results, linguistic aspects related to the stimuli need to be understood. For example, the generative theory explains that speech is considered a sequence of sets of distinctive features, and the phonological processes involved in its acquisition are motivated by acoustic perception 22 . This approach presupposes the existence of an innate mechanism responsible for the acquisition of language, termed Universal Grammar. This mechanism is responsible for guiding the process of language acquisition in children, through its interaction with the linguistic environment in which they are immersed 23 .
Within this generative context, Clements (2009) 24 proposed principles that determine the constitution of linguistic systems, such as the scale of robustness, which posits a universal hierarchy of features in which contrasts higher on the scale are acquired earlier than lower contrasts. In this analysis, the labial feature of /m/ belongs to the top of the robustness scale, as one of the most robust contrasting features, while the [±voice] feature of /d/ occupies a lower position in the hierarchy, as less robust 25 . Thus, /m/ is learned earlier than /d/ and, for this reason, it is a more frequently heard and practiced sound.
Taking /ma/ as an example, in most languages it is among the earliest sounds acquired by infants. In many languages, the word for mother contains the /m/ phoneme, making it easier for babies to say it 11 . Furthermore, the syllable is repeatedly reinforced throughout early childhood 12 .
Repeatedly introduced auditory stimuli can affect how sound is processed in the brain of the listener and thus, modify the auditory evoked responses 15 . Accordingly, studies 26 showed that the capacity for cortical discrimination early in childhood is increased by simple passive sound exposure.
In the present study, the P3 component evoked by /ma/ presented lower latency than that evoked by /da/. This result may be related to the processing of the acoustic characteristics of /ma/, which is learned early and kept in memory, following the comparison of the received stimulus with the previously stored neural representation 27 .
In fact, one of the functions of the working memory is to compare the "new" information that is arriving in our brain by the sensory (auditory) pathways with old information, which is consolidated and stored in long-term memory, acquired since childhood 28 .
The latency of P3 increases with the difficulty of discriminating the stimulus 29 . The lower P3 latency therefore suggests that the phoneme /ma/ was discriminated more easily, which may be associated with the ease of identifying "familiar" sounds present in memory. The same was observed for N2, whose latency shows the same positive correlation with the difficulty of discriminating the speech contrast as that of P3 30 . The N2 component is known to be mixed, linked to the processing of identification and attention to the stimulus.
The other components of the CAEP response form the P1-N1-P2 wave complex. In the present study, the mean latencies for /ma/ were lower than those for /da/, as occurred with N2 and P3. P1 is the first positive peak of the complex and is believed to reflect the control of the auditory information passed on to the auditory cortex. N1 reveals the detection of acoustic changes, and P2 reflects auditory processing beyond sensation 31 .
The results of the present study showed that the /ma/ and /da/ stimuli were acoustically processed in different ways. The perception of consonants occurs through transient acoustic events, which can be perceived separately 32 ; that is, the acoustic analysis proceeds in detail according to the distinct characteristics of each phoneme.
One study 33 reported that the P1-N1-P2 complex reflects the neural representation of perceptually relevant temporal cues, such as changes in voice onset time. Thus, when evoked by different stimuli, as in the present study, the P1-N1-P2 wave complex responds in a distinct way, indicating that this complex is highly dependent on the physical properties of the stimulus used to evoke it.
Research on speech-elicited CAEP is especially interesting because speech perception is the most important social function of the auditory system 34 . Clinical applications with speech stimuli have already been proposed for various uses, such as acoustic verification of amplification in hearing aids 35 and evaluation of the results of auditory training 36 . Thus, the search for more sensitive responses to stimuli that do not require the active participation of the investigated subject calls for further studies on the phoneme /ma/ in other populations, such as children, and using other components of CAEP.

CONCLUSION
Cortical auditory evoked potentials elicited by the speech stimulus /ma/ had, on average, shorter P1-N1-P2-N2 and P3 peak latencies than those elicited by the speech stimulus /da/.