Acoustic analysis of speech intonation pattern of individuals with Autism Spectrum Disorders

Purpose: This study aimed to analyze prosodic elements of speech segments of students with Autism Spectrum Disorders (ASD) and compare them with those of a control group, using acoustic analysis. Methods: Speech was recorded from a sample of male individuals with ASD (n = 19) and with typical development (n = 19), aged 8 to 33 years. The prosody questionnaire of the ALiB (Brazilian Linguistic Atlas), which contains interrogative, affirmative and imperative sentences, was used as a script. Data were analyzed using the PRAAT software and submitted to statistical analysis to verify possible significant differences between the two groups in each prosodic parameter evaluated (fundamental frequency, intensity and duration) and its respective variables. Results: There were significant differences for the variables tessitura, melodic amplitude of the tonic vowel, melodic amplitude of the pretonic vowel, maximum intensity, minimum intensity, tonic vowel duration, pretonic vowel duration and phrase duration. Conclusion: Individuals with ASD present significant differences in prosody compared to those with typical development. Additional studies characterizing the prosodic aspects of the speech of individuals with ASD, with a larger sample and a more restricted age group, are nevertheless necessary.


INTRODUCTION
Autism Spectrum Disorder (ASD) is described in the DSM-5 as encompassing four previously separate disorders (Autistic Disorder, Asperger's Disorder, Childhood Disintegrative Disorder and Pervasive Developmental Disorder Not Otherwise Specified) in a single condition with varying degrees of severity and symptomatology (1).
Thus, with the publication of DSM-5 (1), the Autism Spectrum Disorders received a marker of "severity" based on the degree of impairment. The DSM-5 diagnostic criteria therefore include three severity ratings: Level 1 (requiring support), Level 2 (requiring substantial support), and Level 3 (requiring very substantial support). These classifications are divided into two areas, Social Communication (SC) and Restricted and Repetitive Behaviors (RRB), which characterize the main symptoms of ASD (2).
Prosodic changes in the speech of people with Autism Spectrum Disorders have been reported in the literature since the initial description of this condition (3). However, to date, little is known about how these individuals perceive prosody or about the specific aspects of their prosodic production that lead the interlocutor to perceive their speech as strange (4).
The prosodic aspects observed and described since the beginning of ASD classification include monotonic or robotic speech, deficits in the use of pitch (frequency) or volume control (intensity), deficiencies in vocal quality, and the use of peculiar stress patterns (4). Such difficulties are reported especially in speakers with Asperger's Syndrome or High-functioning Autism (5,6). These prosodic differences are persistent and change little over time, even when other aspects of language improve (4).
The literature reports some inconsistency regarding the specific prosodic mechanisms of individuals with ASD, as well as uncertainty about whether prosody could assist in the diagnostic criteria (7). Only abnormal timbre, intonation, rate, and rhythm or emphasis (e.g., tone of voice may be monotonous or may be raised interrogatively at the end of affirmative phrases) are mentioned as characteristics in the DSM-IV diagnostic criteria (8).
Some studies also refer to poor prosody and monotonous intonation as qualitative aspects of production and marked characteristics of the speech of individuals with ASD, based on the diagnostic criteria mentioned above (9,10). Klin (2006), in turn, adds that individuals with ASD generally exhibit a restricted range of intonation patterns that bear little relation to the communicative function of the utterance (e.g., assertions of fact, humorous comments) (11).
In this sense, prosody can be seen as a suprasegmental phenomenon spanning two aspects: production, characterized by three classic parameters, namely duration (the interval between two events), fundamental frequency and intensity; and perception, characterized by the notions of perceived duration, pitch and loudness (12).
In view of the above, the objective of the present study was to analyze prosodic elements of speech segments of individuals with Autism Spectrum Disorder (ASD) and to compare them with those of a control group, through acoustic analysis.

Ethical aspects
All aspects of this study were approved by the Research Ethics Committee of the Universidade Estadual Paulista (no. 0763/2013). Data collection proceeded once participation in the research had been accepted and the consent form signed, in accordance with resolution CNS 466/12 of the National Health Council.

Participants
Two groups participated in this study: EG = Experimental Group composed of children, adolescents and adults with ASD (n = 19, male gender, age between eight and 33 years) and CG = Control Group composed of individuals with typical development (n = 19, matched by gender and age).
All participants were selected in regular and special schools in a municipality located in the interior of São Paulo, Brazil, according to a list of students with ASD provided by the Municipal Department of Education. Participants in the control group were selected in the same schools, following the inclusion criteria. Chart 1 summarizes the information on the individuals in the experimental group.

Inclusion criteria
The inclusion criteria for the experimental group were a medical report of Autism Spectrum Disorder (previous diagnosis of Asperger's Syndrome) and linguistic performance adequate for recording the speech samples. For the control group, the inclusion criterion was the absence of speech-language complaints.

Identification form
Parents or guardians of all participants were asked for personal identification data, using a questionnaire specifically designed for this research. This questionnaire included items such as name, date of birth, age, gender, schooling / school grades and period of therapy (exclusively for participants in the experimental group).

Scale of Autistic Traits (SAT)
The SAT is not a diagnostic interview but a standardized test that gives a behavioral profile based on the different diagnostic aspects (13). This instrument was therefore used in the present study to characterize the symptomatology presented by the participants with ASD. The SAT is administered after detailed clinical information has been gathered and can be applied from two years of age.
The scale is composed of 23 subscales, each divided into different items. Scoring follows these criteria: each subscale provides a value from 0 to 2; it is scored 0 if no item is present, 1 if only one item is present, and 2 if more than one item is present. The subscale scores are then summed, with a cut-off value of 23 indicating that the person may present some degree of Autism Spectrum Disorder.
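The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the input format (the number of items present in each subscale) are hypothetical, and the text does not state whether a total exactly equal to the cut-off counts, so the sketch assumes that a total of 23 or more is taken as indicative.

```python
SAT_SUBSCALES = 23  # number of subscales, as stated in the text
SAT_CUTOFF = 23     # cut-off value, as stated in the text


def score_subscale(items_present: int) -> int:
    """Return 0 if no item is present, 1 if exactly one, 2 if more than one."""
    if items_present < 0:
        raise ValueError("items_present must be non-negative")
    return min(items_present, 2)


def sat_total(items_present_per_subscale: list[int]) -> int:
    """Sum the 0-2 scores of all 23 subscales."""
    if len(items_present_per_subscale) != SAT_SUBSCALES:
        raise ValueError(f"expected {SAT_SUBSCALES} subscales")
    return sum(score_subscale(n) for n in items_present_per_subscale)


def reaches_cutoff(total: int) -> bool:
    # Assumption (not stated in the text): a total equal to the cut-off counts.
    return total >= SAT_CUTOFF
```

With this rule the total ranges from 0 (no item present in any subscale) to 46 (more than one item present in every subscale).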

Prosody Questionnaire - Brazilian Linguistic Atlas (ALiB)
For data collection of the speech samples, the prosody questionnaire of the ALiB (Annex A), composed of 11 questions related to interrogative, affirmative and imperative phrases, was applied. This questionnaire enables the collection of a semi-directed speech sample since, through the instructions given to the inquirer, it becomes possible to obtain certain answers.
The questionnaire is succinct and objective, focusing on the two sentence modalities with distinctive value in Portuguese (assertive and interrogative patterns) and three basic patterns of expressive intonation: dislike, contentment and order.

Data collection procedure
Initially, the parents and the participants were interviewed in a room reserved specifically for this purpose at the institution where the study was conducted. Subsequently, the SAT was applied to the parents or guardians, and the speech-language pathology screening to the participants.
The SAT applied to the parents or guardians showed that all the participants scored above the cut-off of 23, as expected, indicative of Autism Spectrum Disorder, with a mean score of 33, a lowest score of 27 and a highest score of 40. The procedures were performed under the supervision of a psychiatrist, a speech therapist and a psychologist, and diagnoses had already been established by a specialized multidisciplinary team.
In the speech-language pathology screening, the following skills were evaluated: connected speech, comprehension of simple and complex commands, and vocal quality. To identify the presence of connected speech, participants were initially asked about their hobbies and activities during the week; to evaluate the ability to understand simple and complex commands, participants were asked to perform two or more actions. Finally, to verify the presence or absence of vocal alterations, the participants were asked to produce the sustained vowels a, i and u. All participants presented adequate skills for participating in the study.
Once the initial results were found to be compatible with the inclusion criteria, the participants were referred to the acoustic booth for recording of the speech samples. Before any procedure was carried out, the research was thoroughly explained to the parents or guardians, who expressed their consent to participate by signing the free and informed consent form, according to resolution CNS 466/12.
The acoustic booth where the recordings were made is installed in the institution and acoustically isolated, and provides high-fidelity equipment inside: a MARANTZ digital recorder, model PMD 660, coupled to a Sennheiser dynamic cardioid microphone, model e815s. For the recordings, the participants were seated comfortably at a distance of approximately six centimeters from the microphone. They were instructed to remain attentive to the speech elicitations provided by the researcher, who followed the ALiB prosody questionnaire as a script, in order to produce the target phrases appropriately according to the interrogative, affirmative or imperative pattern.
To elicit the target phrases listed under "Expected response" in the ALiB questionnaire, the researcher read the questions contained in the questionnaire itself (Annex A). Because the final objective was a speech sample close to spontaneous speech on the part of the participants, we opted for a neutral reading of the statements by the researcher (without facial expression or prosodic variation). Thus, although it cannot be characterized as an objective form of elicitation, the situation was as close as possible to a dialogue with the participants, constituting a modality of semi-directed discourse.
In some cases, pre-training was necessary for the participant to understand the task. For this training, one question of the questionnaire was randomly selected as an example of what should be done during the recording and to clarify doubts. In this way, the collection was not biased and the instructions were clear to all participating individuals.
Data obtained with the recordings were saved on a computer for acoustic analysis using PRAAT software.

Data analysis procedure
The 11 sentences (expected responses) obtained from the recordings were analyzed with the software. To do so, the audio files were extracted from the recorder, organized into a database and analyzed using PRAAT software version 5.3.60. All sentences were analyzed with respect to three parameters offered by the software: fundamental frequency, intensity and duration.
In relation to the fundamental frequency parameter (F0), the following measures were analyzed:
- Initial F0 and final F0 of the utterances: extracted from the center of the first vowel (initial F0) and of the last vowel (final F0) of the sentences.
- Maximum F0 and minimum F0 of the utterances: the maximum and minimum F0 values of the sentences.
- Tessitura of the utterances: difference between the maximum F0 and minimum F0 of the sentences.
- Melodic amplitude (MA) of the salient tonic vowel and of the pretonic vowel, separately: difference between the maximum and minimum F0 values in the tonic and pretonic syllables.
- Melodic variation velocity rate (MVVR) of the salient tonic vowel and of the pretonic vowel: difference between the maximum and minimum F0 values of the tonic vowel, divided by the duration of the tonic vowel. The same procedure was adopted for the pretonic vowel.
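The derived F0 measures listed above reduce to simple arithmetic on values exported from the acoustic analysis. The following Python sketch illustrates the definitions only; the function names are hypothetical, the units (Hz for F0, seconds for duration) are an assumption, and the sample values are invented for illustration and are not data from this study.

```python
def tessitura(f0_max: float, f0_min: float) -> float:
    """Difference between the maximum and minimum F0 of an utterance."""
    return f0_max - f0_min


def melodic_amplitude(vowel_f0: list[float]) -> float:
    """Difference between the maximum and minimum F0 within one vowel."""
    if not vowel_f0:
        raise ValueError("at least one F0 value is required")
    return max(vowel_f0) - min(vowel_f0)


def mvvr(vowel_f0: list[float], vowel_duration: float) -> float:
    """Melodic amplitude of a vowel divided by the duration of that vowel."""
    if vowel_duration <= 0:
        raise ValueError("vowel_duration must be positive")
    return melodic_amplitude(vowel_f0) / vowel_duration


# Invented F0 samples (assumed Hz) for one tonic vowel lasting 0.25 s.
tonic_f0 = [180.0, 210.0, 195.0]
print(melodic_amplitude(tonic_f0))  # 30.0
print(mvvr(tonic_f0, 0.25))         # 120.0
print(tessitura(240.0, 150.0))      # 90.0
```

The same two vowel-level functions apply unchanged to the pretonic vowel, since the text adopts the same procedure for it.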
Regarding the intensity parameter, the following measures were analyzed:
- Maximum and minimum intensity of the utterances.
- Difference between the maximum and minimum intensity values of the utterances.
Finally, in relation to the duration parameter, the following measures were analyzed:
- Duration of the salient tonic vowels.
- Duration of the pretonic vowels.
- Duration of the sentences.
To analyze the recordings, the values of each of the 15 variables were first extracted for the 11 sentences produced by each participant. From these values, the average value of each variable was calculated for each participant. These averages were submitted to statistical analysis to verify possible statistically significant differences.
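The averaging step can be illustrated as follows. This is a minimal sketch, assuming each sentence is represented as a mapping from a variable name to its measured value; the variable names and numbers are hypothetical and are not data from this study.

```python
from statistics import mean


def participant_means(sentences: list[dict[str, float]]) -> dict[str, float]:
    """Average each variable over all sentences produced by one participant."""
    if not sentences:
        raise ValueError("at least one sentence is required")
    variables = sentences[0].keys()
    return {name: mean(s[name] for s in sentences) for name in variables}


# Two invented sentences with two of the 15 variables, for illustration only.
example = [
    {"tessitura": 80.0, "utterance_duration": 1.0},
    {"tessitura": 100.0, "utterance_duration": 2.0},
]
print(participant_means(example))
# {'tessitura': 90.0, 'utterance_duration': 1.5}
```

In the study design, each participant would contribute 11 such sentence records, yielding one mean per variable per participant for the group comparison.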

Statistical analysis
For the statistical analysis, the Mann-Whitney Test was used, in order to verify possible statistically significant differences between the two groups studied in each prosodic parameter evaluated (fundamental frequency, intensity and duration) and their respective variables. The statistical package IBM SPSS (Statistical Package for Social Sciences), version 21.0, was used to obtain the results.
A significance level of 5% (0.050) was adopted for the statistical test. When the calculated significance value (p) was lower than 5% (0.050), the difference was considered statistically significant, that is, an effective difference between the groups. When the calculated significance value (p) was equal to or higher than 5% (0.050), the difference was considered statistically non-significant and the groups were treated as similar.
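For readers who wish to see the comparison spelled out, the following Python sketch implements a two-sided Mann-Whitney test and the 5% decision rule. It is not the SPSS procedure used in the study: it assumes the normal approximation with tie correction and no continuity correction, so p-values may differ slightly from those of a statistical package, especially one reporting exact p-values. All function names are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (U, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    tie_term = sum(t**3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # all observations identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


def is_significant(p: float, alpha: float = 0.05) -> bool:
    """Decision rule from the text: significant only when p is below alpha."""
    return p < alpha
```

In this design, `x` and `y` would hold the per-participant means of one variable for the experimental and control groups (19 values each), and the test would be repeated for each of the 15 variables.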

RESULTS
From the analysis of the speech samples performed using the PRAAT software, it was possible to obtain the absolute values of each of the three parameters studied (fundamental frequency (F0), intensity and duration) and their specific variables, namely: initial F0 and final F0 of the utterance; maximum F0 and minimum F0 of the utterance; tessitura of the utterance (T); melodic amplitude (MA) of the salient tonic vowel and of the pretonic vowel, separately; melodic variation velocity rate (MVVR) of the salient tonic vowel and of the pretonic vowel; maximum and minimum intensity of the utterance; difference between the maximum and minimum intensity values of the utterance; and duration of the salient tonic vowels, the pretonic vowels and the utterance.
For better visualization of the results, the average values obtained with the production of the 11 utterances of the ALiB questionnaire were extracted, as shown in Tables 1, 2 and 3.
Figures 1, 2 and 3 show the mean values of the three acoustic parameters studied and their respective variables for the experimental and control groups.
Using the Mann-Whitney Test, statistically significant differences between the groups were found for the variables tessitura, tonic vowel melodic amplitude, pretonic vowel melodic amplitude, maximum intensity, minimum intensity, duration of the tonic vowel, duration of the pretonic vowel and duration of the utterance.

DISCUSSION
This research reached its intended objective of analyzing prosodic elements of speech segments of individuals diagnosed with Autism Spectrum Disorder (ASD) and comparing them with those of individuals in the control group, through acoustic analysis. It is important to mention, however, that the study sample covers a wide age range (from eight to 33 years), which is a limitation of the study and does not allow the findings to be generalized. This situation stems from the difficulty in locating people diagnosed with ASD in the region of this study. Regarding the variable "period of therapy", three individuals (9, 15 and 17) who reported having attended therapy for 10 years or more did not present large differences in the values analyzed when compared to the other participants with ASD. This may indicate that such aspects are not adequately addressed in the therapeutic context. Regarding the symptomatological characteristics obtained through SAT scores, no differences were observed that could indicate a relation between the degree of the characteristics presented by the individuals and the prosodic changes in their speech.
In relation to the fundamental frequency parameter, statistically significant differences were observed for the variables tessitura of the utterance, melodic amplitude of the salient tonic vowel and of the pretonic vowel, and melodic variation velocity rate of the salient tonic vowel and of the pretonic vowel. These data do not confirm statements in the literature regarding monotonous speech without melodic variation, considering that the values of these variables were higher than those of the control group (4,10,11).
No studies were found that evaluated the variable tessitura in individuals with ASD. It is known that tessitura variations can shift the fundamental frequency patterns to lower or higher levels without changing the typical shape of the melodic curve. Changes in tessitura play an expressive function linked to the discursive intentions of the speaker (focalization, introduction, prolongation and closure) and also convey extralinguistic information, such as the identity, gender, age, psychological state, personality, and geographic and cultural origins of the individual. Considering this, the higher tessitura values of the experimental group in relation to the control group may be a peculiar characteristic of the speech of individuals with ASD (14).
Also regarding the fundamental frequency parameter, no statistically significant differences were observed for the variables initial, final, maximum and minimum fundamental frequency of the utterance, although these values are predominantly higher in the experimental group than in the control group. This finding partially corroborates a study that reports no statistically significant difference in the fundamental frequency parameter for a preschool group but a significant difference for a school-age group (15). A possible explanation for the greater variability observed in the fundamental frequency parameter of the experimental group in relation to the control group may be the deficit presented by individuals with ASD in the mechanisms controlling pitch. This deficit may derive from a problem at the reception level, at the production level, or in the relationship between the two, which leads to production that is inadequate in relation to what is expected from the discourse (16).
Regarding the intensity parameter, statistically significant differences were observed between the group with ASD and the control group for the variables maximum intensity and minimum intensity of the utterance. These data corroborate statements in the literature that people with ASD appear to have poor voice volume control (e.g., the voice is very loud despite the physical proximity of the conversation partner) (4,11).
The intensity of an individual's speech should vary according to linguistically relevant factors, such as the form of communication (speech, crying, shouting, moaning, etc.); paralinguistic factors, such as tone of voice; and extralinguistic factors, such as the distance between the participants and the physical and social setting where the conversation takes place. Thus, the deficits presented by individuals with ASD in the perception of these factors may explain the greater variability of intensity in relation to the individuals in the control group (14).
Regarding the duration parameter, statistically significant differences were observed for the three measures studied: duration of the tonic vowel, duration of the pretonic vowel and duration of the utterance. A slowed speech pattern was therefore found in the ASD group, characterized by a longer duration, in seconds, of the same utterances produced by the control group individuals. These findings corroborate the study by Diehl and Paul (2013), which states that individuals with ASD present longer speech durations than individuals with typical development (17).
The longer duration, in seconds, observed in the production of the utterances and of the tonic and pretonic vowels, already reported in other studies, may be explained by the motor deficit frequently presented by individuals with ASD. Another hypothesis is an erroneous perception of this prosodic parameter, which could lead to deviant production (17,18).
Two main types of prosody are mentioned in the literature: affective and linguistic. The first involves non-verbal aspects of language, which are necessary to transmit and recognize emotions in communication, enabling differences in the expression and comprehension of happiness, sadness, anger, etc. In this way, the intonation pattern that accompanies the statement suggests the speaker's emotional state (19).
Linguistic prosody, in turn, acts at the phonological and syntactic levels and enables individuals to express the specific meaning of a statement by giving emphasis to parts of words and phrases, thus transmitting, for example, an affirmative or interrogative message. Thus, individuals with processing alterations may present important difficulties in producing vocal inflections indicative of emotions, as well as in understanding them, which causes an important impairment of social interaction through communication (20).

FINAL CONSIDERATIONS
The participants of the present study diagnosed with Autism Spectrum Disorder presented speech characterized by greater variation in fundamental frequency (tessitura) throughout the utterance, greater melodic amplitude of the tonic and pretonic vowels, greater intensity variation (speech was both louder and softer than that of individuals with typical development), and slower speech with regard to the duration of the utterance and of the tonic and pretonic vowels.
The data and discussions presented here highlight the relevance of speech-language pathology in the assessment and intervention process related to the linguistic and communicative skills of individuals diagnosed with Autism Spectrum Disorder, especially with regard to the suprasegmental aspects of speech (prosody) and their variations in relation to typical development. Given that prosody is an important aspect of the pragmatic use of language, the variables interfering in this aspect must be carefully observed, both in the evaluation process and in intervention.
Therefore, this research reiterates the importance of studies characterizing the prosodic aspects of the speech of individuals with Autism Spectrum Disorder, and the need to analyze a larger sample with a more restricted age group in order to ensure greater reliability of the results.

Figure 1. Mean values of the groups for the acoustic parameter "fundamental frequency". Mann-Whitney test. Caption: * = significant difference (p < 0.05); EG = experimental group; CG = control group

Figure 2. Mean values of the groups for the acoustic parameter "intensity"
Figure 3.

Chart 1. Characteristics of the participants of the ASD group. Caption: SA = Asperger Syndrome; TEA = Autistic Spectrum Disorder; HFA = High-functioning Autism; SAT = Scale of Autistic Traits; Ed = Education
High-fidelity equipment inside the acoustic booth: MARANTZ digital recorder, model PMD 660, coupled to a Sennheiser dynamic cardioid microphone, model e815s.

Table 1. Mean values of the utterances for the acoustic parameter "fundamental frequency"

Table 2. Mean values of the utterances for the acoustic parameter "intensity". Caption: E = Experimental; C = Control

Table 3. Mean values of the utterances for the acoustic parameter "duration"