Word final prolongations: acoustic characteristics and influence on speech fluency perception

ABSTRACT The aim of this study was to characterize word final prolongations in individuals with and without stuttering as well as to investigate the influence of end of words prolongations on speech fluency perception. In Experiment 1, 14 subjects were submitted to speech fluency evaluation for analysis of duration and average frequency of extended phones at the end of words. In Experiment 2, twenty lay judges were asked to judge the fluency degree of utterances without disfluency, utterances containing prolongations and utterances containing filled pauses. In experiment 1 the groups differed only in the duration’s variation; in both groups the prolongations in monosyllabic words prevailed; 80% of the prolonged phones in both groups were vowels. In experiment 2 no significant difference was found in the comparison between the judgements of prolongations and filled pauses. The utterances without disfluency differentiated themselves significantly from the others. We suggest that characteristics such as position in the word, duration and physical concomitants should be considered before deeming prolongations as a stuttering-like disfluency.


Keywords: stuttering; speech acoustics; speech; language and hearing sciences.

Introduction
Speech fluency can be characterized as the continuity of the speech flow, together with the rate, rhythm, smoothness and effort of speech production (ASHA, 1993). Continuity is in turn influenced by decisions made during the linguistic planning and formulation of speech (Scarpa & Fernandez-Svatsman, 2012). Speech fluency manifests differently in each speaker and can depend on factors such as prior knowledge of the subject (Felsenfeld et al., 2000). According to the Psychiatry Association (Coleman, 2013), stuttering is a disorder related to a disturbance of motor speech production that has an impact on fluency; it may cause disfluencies such as syllable and sound repetitions, blocks and sound prolongations, which may occur in vowels and consonants.
Disfluency is an interruption of the continuous flow of speech. In itself, the presence of disfluency does not indicate a speech disorder, since speech rarely occurs without it. Speech produced by people who stutter (PWS) and people who do not stutter (PWDS) frequently contains elements such as pauses, word repetitions and prolongations (Macgregor, Corley, & Donaldson, 2009). Disfluencies do not all have the same syntactic function (Tree, 1995), and the functional and structural connectivity of the brain when disfluency occurs is not the same in people who stutter and in people who do not stutter (Sitek et al. 2016).
Two elements related to disfluency seem to differentiate people who stutter from people who do not stutter: frequency and disfluency type. Yairi (1997) introduced the term "stuttering-like disfluency" for the disfluencies typically observed in developmental stuttering. As to frequency, at least 3% of "stuttering-like disfluencies" are necessary for a speaker to be considered a PWS. Although stuttering-like disfluencies are important for the diagnosis of stuttering, they also occur in PWDS (Juste & Andrade, 2006).
Prolongation is a stuttering-like disfluency (Costa, Ritto, Juste & Andrade, 2017) and can be characterized as a sound (consonant or vowel) spoken with a longer duration than normally expected (Johnson, 1961), causing an inappropriate elocution duration; it may be accompanied by pitch elevation (Yaruss, 1998). In the linguistics literature, prolongation is called elongation or filled pause (Betz & Kosmala, 2019; Gold, Ross, & Earnshaw, 2018; Defrancq & Plevoets, 2018) and is defined as a conversational marker (Betz & Kosmala, 2019; Bellinghausen, Fangmeier, Schröder, & Keller, 2019) used until the utterance is complete (Betz & Kosmala 2019; Gósy, 2019), signalling end of turn, attention and confirmation of the listener (Gósy, 2019), whether or not it is related to difficulties the speaker has in planning and formulating what is said.
There is evidence that disfluencies affect the understanding of listeners (Macgregor et al. 2009; Dejoy & Jordan, 1988; Tree, 2001; Corley, Macgregor & Donaldson, 2007; Macgregor, Corley & Donaldson, 2010). Among the disfluencies reported in these studies, filled pauses (along with silent pauses and prolongations) marked hesitations on the speakers' part. Like facial expressions and tone of voice, hesitant disfluencies provide information to the listener. This information can refer to problems in production by the speaker or to the upcoming content of the message itself (Corley & Stewart, 2008).
Prognostic tools for stuttering use, among other criteria, the presence of prolongations and blocks, as well as prolongations longer than one second, as risk factors for stuttering chronicity (Tumanova et al. 2011). Boey, Wuyts, Van de Heyning, De Bodt and Heylen (2007) found that children with stuttering were significantly more likely to have prolongations and blocks. Juste and Andrade (2010) investigated the occurrence of prolongations and their location within the word in individuals with and without stuttering, and observed that individuals who stutter had more occurrences within words, whereas fluent individuals had more occurrences in the final syllable of words.
In a study on the influence of gender and level of schooling on the speech fluency of fluent adults (Andrade & Martins-Reis, 2011), there were more occurrences of sound prolongation in subjects with higher education. As for common disfluencies, interjections and revisions were found in adults schooled up to primary education. The authors concluded that replacing a filled pause with a prolongation seems to be a flexible strategy of the linguistic processing component in individuals with higher education. For the authors, prolongation is the form that individuals with higher education use to resume speech and language processing; it appeared only at the end of words and can be considered, in terms of conversation analysis, a marker of hesitation. They also consider that it can be an indication of syntactic or semantic-lexical errors, used as a strategy in the production of speech (Postma & Kolk, 1993). Roberts, Melter and Wilding (2009) evaluated the effect of speech sample length and correlated levels of fluency in monologues of fluent adult males. Prolongation was present in the speech of half the subjects participating in the research, and many prolongations occurred at the end or at the beginning of the word, suggesting that prolongation can be used deliberately for emphasis. The same authors related the site of rupture of the prolongation to the places in which an interjection would normally occur.
In a study by Juste and Andrade (2011), word prolongation and rupture site were analysed in the speech of stuttering and fluent individuals. The results showed that individuals with stuttering have a greater occurrence of prolongation in the syllable nucleus position. In syllables of Brazilian Portuguese, the nucleus is always filled by a vowel (Câmara, 1976; Castro & Wertzner, 2009; Souza, 1998).
In the quantitative and qualitative analysis of the speech prolongation of subjects with and without fluency disorder (Silva et al. 2016), individuals with disorders had a higher occurrence of non-hesitant prolongations at the beginning of words and in isolated words, modifying the lexical unit, corroborating the study by Juste and Andrade (2010). In fluent individuals, there was a higher frequency of hesitant prolongations at the beginning and end of the word, which may be associated with a hesitation mark. The hesitant prolongations tended to occur in monosyllabic words or in unstressed final syllables.
From the perspective of perception studies of such phenomena, Corley et al. (2007) point out that several corpus analyses and behavioural studies suggest that disfluencies can affect listeners. A long-term consequence of disfluency is that the speaker is rated as less likely to know the answers to general knowledge questions when their answers are preceded by filled pauses (Brennan & Williams, 1995). For MacGregor et al. (2009), filled pauses and hesitations, as well as repeated or prolonged words, normally occur when the speaker is uncertain about how to continue and are part of the linguistic input that a listener must interpret. Bailey and Ferreira (2003) have shown that filled pauses can affect the interpretation of syntactically ambiguous sentences by listeners. Hearing a phrase that has a filled pause improves memory for the subsequent word (Corley et al. 2007), possibly due to an increase in attention (Collard, Corley, MacGregor, & Donaldson, 2008). MacGregor et al. (2010) concluded that disruptive pauses may increase listeners' expectations for a lexical item that is harder for the speaker to produce. The authors emphasized that the disfluent utterances in their study included features of disfluency, such as prolongation, before the interruption itself, and therefore the effects cannot be attributed solely to the presence of the silent pause.
The role of the listener and his/her perception of the disfluencies presented by the speaker is considered a criterion for the evaluation of speech in stuttering. This evaluation may be made, for example, by judges who have no relation to the speaker or the therapist, that is, "naive listeners" (Huinck & Rietveld, 2007). Lay judges are able to distinguish even the attitudes expressed by people who are fluent from those expressed by people who stutter (Celeste, 2010).
The literature review presented here points to at least two types of prolongations: hesitant (which can be classified as a common disfluency) and non-hesitant (considered a stuttering-like disfluency). The question is whether word final prolongations have hesitant characteristics and, if so, whether they are similar in people who stutter and people who do not stutter. The acoustic characterization of such prolongations may be an important analysis for this investigation.
As the literature indicates that fluent individuals tend to present vowel prolongation at the end of words, the present study aimed to (1) characterize word final prolongations regarding the type of prolonged sound, the variation of the duration of the prolonged sound and the variation of the mean frequency of the prolonged sound in individuals with and without developmental stuttering. The present study also aimed to (2) verify lay judges' perception of the degree of fluency of prolongations in rhyme position at the end of words and (3) compare the influence of prolongations in rhyme position at the end of words and of filled pauses on speech fluency perception. The study may represent one more step toward establishing subtypes of prolongation, which will contribute to the accuracy of the diagnosis of stuttering and other fluency disorders.

Methods
This research was submitted to and approved by the Research Ethics Committees under CAAE 0308.0.203.000-11 and opinion number 122/09. All participants in the survey were volunteers and signed a Consent Form.

Experiment 1: acoustic analysis
Participants: 14 adults were recruited: six with a speech-language pathologist's diagnosis of stuttering (people who stutter, PWS) and eight fluent speakers (people who do not stutter, PWDS). The inclusion criteria common to both groups were to be between 18 and 29 years old and to be a native speaker of Brazilian Portuguese. The specific inclusion criteria for the PWS group were a previous diagnosis of stuttering and being on the waiting list for consultation at a Disfluency Clinic of any Brazilian University Hospital. Exclusion criteria for both groups were: history of neurological and/or psychiatric illness; history of associated communication disorders; poor quality audio recordings; and absence of sound prolongations at the end of words. Initially there was no distinction of sex or age, but only males met both the inclusion and exclusion criteria of the research.
Speech Fluency Profile analysis: all participants were assessed using a Brazilian standard Fluency Assessment Protocol (Andrade, 2000) and Riley's Stuttering Severity Instrument (SSI-3) (Riley, 1994) to establish whether or not there was evidence of developmental stuttering. All speech samples were recorded using a tripod-mounted digital camcorder. To enable subsequent acoustic analysis, the speech samples were also recorded using a digital voice recorder coupled to a unidirectional microphone. Spontaneous speech samples were transcribed (fluent and disfluent syllables) to: a) classify the disfluencies (common disfluencies: hesitation, interjection, revision, unfinished word, word repetition, segment repetition and phrase repetition; stuttering-like disfluencies: syllable repetition, sound repetition, blocking, pauses, intrusion of sound or segment and prolongation); b) calculate speech rate (words and syllables per minute). To guarantee data reliability, the transcripts and the analysis of the Fluency Assessment Protocol (Andrade, 2000) were submitted to inter-rater agreement analysis. The transcriptions and analysis were performed by two team members, with an agreement rate of at least 90%.
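The speech rate and agreement measures mentioned above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names are hypothetical, the input numbers are invented, and point-by-point percentage agreement is only an assumption, since the text does not state how the 90% agreement rate was computed.

```python
def speech_rate(n_words, n_syllables, duration_seconds):
    """Return (words per minute, syllables per minute) for one speech sample."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    minutes = duration_seconds / 60.0
    return n_words / minutes, n_syllables / minutes


def percent_agreement(labels_a, labels_b):
    """Point-by-point agreement (in %) between two transcribers' labels.

    Assumption: both transcribers labelled the same sequence of items.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return 100.0 * matches / len(labels_a)


# Invented example: a 2-minute sample with 120 words and 240 syllables.
wpm, spm = speech_rate(120, 240, 120)
agreement = percent_agreement(
    ["fluent", "prolongation", "fluent", "hesitation"],
    ["fluent", "prolongation", "fluent", "interjection"],
)
```

With these invented inputs the agreement would fall below a 90% criterion, which in practice would call for the two transcribers to review the divergent items.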
Prolongations' acoustic analysis: Data were transferred from the recorder to a computer, and the acoustic analysis was performed using Praat version 5.3.83 (Boersma & Weenink, 2017). To label the data, an interval tier was created to mark the beginning and the end of prolongations that appeared at the end of words. The syllable was defined according to Selkirk (1982), who proposes a structure composed of onset and rhyme; the rhyme is composed of a nucleus and a coda. The nucleus is the only component that cannot be empty. In Brazilian Portuguese the nucleus is always occupied by a vowel or a diphthong (Bisol, 1989). To obtain reference values of duration and F0 for the prolongations, after labelling, non-prolonged sounds were selected from words with the same phonetic context. The mean duration and the fundamental frequency measures of the prolonged and reference sounds were calculated. Duration is the time between the beginning and the end of the prolonged sound and was measured in seconds. For the fundamental frequency measures, the F0 mean was extracted for each prolonged sound. This means that each analysed sound had a reference from the same person in a similar phonetic context. To analyse the data, the variation of each of these measures was considered, according to the formulas below:
Duration variation = prolonged sound duration - reference sound duration
F0 mean variation = F0 mean of prolonged sound - F0 mean of reference sound
Statistical analysis: An Excel spreadsheet database was created containing the characteristics of all prolongations identified: number of syllables and sounds of the word with prolongation(s), type of prolonged sound, and duration and F0 mean of the prolonged and reference sounds. For the statistical analysis, measures of central tendency and dispersion were calculated for the continuous variables and frequencies for the categorical variables. Most of the variables did not present a normal distribution; therefore, the comparisons were made with Pearson's chi-square, Fisher's exact and Mann-Whitney tests. The level of significance was 5%.
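As an illustration of the two variation measures and of the rank-based group comparison, the following Python sketch may help. All measurements in it are invented, the field and function names are hypothetical, and the Mann-Whitney function returns only the U statistic (no p-value); the study itself does not describe its computations at this level of detail.

```python
def variation(prolonged, reference):
    """Variation = value of the prolonged sound minus value of its reference."""
    return prolonged - reference


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x; tied values get average ranks."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2.0


# Invented tokens: durations in seconds, F0 means in Hz.
tokens = [
    {"group": "PWS", "dur": 0.40, "ref_dur": 0.10, "f0": 118.0, "ref_f0": 120.0},
    {"group": "PWS", "dur": 0.35, "ref_dur": 0.15, "f0": 125.0, "ref_f0": 122.0},
    {"group": "PWDS", "dur": 0.60, "ref_dur": 0.10, "f0": 110.0, "ref_f0": 112.0},
    {"group": "PWDS", "dur": 0.75, "ref_dur": 0.15, "f0": 130.0, "ref_f0": 126.0},
]

duration_variation = {"PWS": [], "PWDS": []}
f0_variation = {"PWS": [], "PWDS": []}
for t in tokens:
    duration_variation[t["group"]].append(variation(t["dur"], t["ref_dur"]))
    f0_variation[t["group"]].append(variation(t["f0"], t["ref_f0"]))

u_duration = mann_whitney_u(duration_variation["PWS"], duration_variation["PWDS"])
```

Taking the difference against a reference sound from the same speaker and a similar phonetic context means that each prolongation is judged relative to that speaker's own typical production rather than against an absolute threshold.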

Experiment 2: perceptual test
For the perceptual test, we collected 25 utterances produced by 39 fluent adults (20 males and 19 females, with an average age of 22.2 years). Ten utterances contained prolongations in rhyme position at the end of words, ten utterances contained filled hesitations and five utterances had no disfluencies. Phrases containing other concomitant disfluencies were discarded. The classification of disfluencies followed the methodology established by Andrade (2000), and the speech samples were edited using Praat 5.3.83 (Boersma & Weenink, 2017).
Judges: Twenty participants (10 males and 10 females, with an average age of 24 years) were selected for the perceptual test. The inclusion criteria for this group were having no speech fluency disorder and being older than 18 years. Speech-language pathologists and speech-language pathology students were excluded, as were participants who did not meet the other inclusion criteria. None of the judges reported any hearing problems.
Procedures: The perceptual test was answered on a test sheet which contained the numbers 1 to 25 in a vertical column, corresponding to the speech samples, and a rating line numbered from 0 to 5, which indicated the grading of speech fluency (where 0 represented excess disfluency and 5 represented a high level of fluency). To standardize the data for analytical purposes, the following degrees of fluency were considered: 0-1, bad speech; 2-3, moderate speech; and 4-5, good speech. The participants were asked to mark the column corresponding to their judgment of fluency on the line of each speech sample. The utterances were arranged randomly in the test. Each utterance was presented twice and could be repeated up to three times in case of doubt. The perceptual test was applied collectively, in the presence of all twenty judges.
Data analysis: data were submitted to descriptive statistical analysis (frequency table) and inferential analysis (One and Two Sample Proportion Test), with a significance level of 5%.
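The grouping of scores into degrees of fluency and the comparison of two proportions can be sketched in Python as follows. This is a hedged illustration: the function names are hypothetical, the counts are invented, and the pooled two-sided z-test is an assumption, since the text does not specify which variant of the Two Sample Proportion Test was used.

```python
import math


def fluency_degree(score):
    """Map a 0-5 rating onto the three degrees of fluency used in the analysis."""
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    if score <= 1:
        return "bad"
    if score <= 3:
        return "moderate"
    return "good"


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sample z-test for proportions; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("test is undefined when the pooled proportion is 0 or 1")
    z = (p_a - p_b) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Invented counts of "moderate" ratings out of 200 judgments per utterance type.
z, p = two_proportion_z_test(110, 200, 104, 200)
differs_at_5_percent = p < 0.05
```

Collapsing six rating points into three degrees reduces the number of sparse categories, at the cost of treating, for example, a score of 2 and a score of 3 as equivalent.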

Results
Regarding the acoustic analysis, in the PWS group there were 20 prolongations at the end of words and in the PWDS group there were 29. There was no difference between groups regarding the number of syllables in the words with prolongations (Pearson's chi-square test), with a predominance of monosyllabic words (Table 1). In Table 2 it was not possible to verify whether there was a difference between the groups in the number of sounds of the word with prolongations, due to the number of cells with an expected value less than 1.0, but the Mann-Whitney test did not find statistically significant differences between the groups in relation to the average number of sounds of the words with prolongations (p = 0.107). Table 3 presents the characterization of the groups regarding the distribution of the prolonged sounds. It was not possible to calculate the chi-square due to the number of cells with a value lower than 5.0, but there was a greater frequency of prolongations in the phone ɪ_ in the PWS group and ɐ_ in the PWDS group. In Tables 4 and 5 the frequency of prolongation can be observed taking into account the type of prolonged phone, and no difference was observed between the groups. In both groups more than 80% of prolonged sounds were vowels. It was not possible to calculate the chi-square due to the number of cells with a value lower than 5.
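The remark about cells with low expected values can be made concrete with a small Python sketch. The row totals (20 and 29) come from the text above, but the distribution of counts across columns is invented for illustration, and the function names are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table must contain at least one observation")
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


def count_small_expected(table, threshold=5.0):
    """Number of cells whose expected count falls below the threshold."""
    return sum(1 for row in expected_counts(table) for e in row if e < threshold)


# Rows: PWS (20 prolongations), PWDS (29 prolongations).
# Columns: three hypothetical categories; the cell counts are invented.
observed = [
    [12, 6, 2],
    [18, 9, 2],
]
n_small = count_small_expected(observed)
```

When several cells have small expected counts, the chi-square approximation becomes unreliable, which is consistent with the use of Fisher's exact and Mann-Whitney tests listed in the Methods.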
Regarding the acoustic aspects of duration and F0 mean of prolongations, the groups differed only in terms of the variation of duration of the prolonged sounds, which was higher for the PWDS group. This showed that the fluent participants produced longer prolongations (Table 6).
For the perceptual test, the descriptive analysis of the data indicated that in the study sample the sentences with filled pauses/hesitations and prolongations were classified mainly as moderate speech, whereas sentences without disfluencies were predominantly classified as good speech (Table 7). To analyse whether there was a significant difference between the assessments of the different statements, a One Sample Proportion Test was carried out comparing the proportions found. The p-value > 0.05 indicated that there was no significant difference between the proportions evaluated. The results of the Proportion Test can be seen in Table 8. In the comparison between the proportions of responses to the statements containing prolongation and the statements containing filled pauses/hesitations, there was no significant difference. In the comparison between statements with prolongation and statements without disfluency (fluent speech), the Proportion Test showed statistical differences. The Proportion Test also showed different proportions in the comparison between statements containing filled pauses/hesitations and statements without disfluency.

Discussion
The study, through experiment 1, sought to characterize word final prolongations regarding the type of prolonged sound, the variation of duration and the mean frequency of the prolonged sound in subjects who stutter and who do not stutter. In general, there was no difference between the two groups regarding the characteristics of word final prolongations. Prolongations were most frequent in monosyllabic words and in vowels (Tables 1, 4 and 5). The groups differed only in terms of the variation of duration of the prolonged sounds (Table 6).
The fact that fluent participants demonstrated a longer duration of the prolongations may be explained by a previous study which observed that people who stutter tend to be less accurate and more variable during the production of speech (Boutsen, Brutten & Watts, 2000).
We also investigated the influence of the presence of prolongations on a) general perception of fluency; b) degrees of fluency according to type of speech, and c) comparisons between hesitation and speech without disfluency. The results of this experiment did not indicate a difference between word final prolongations and hesitation regarding the degree of fluency (Tables 7 and 8). It must also be emphasized that the utterances containing prolongation and hesitation were judged to be worse than statements without disfluencies (Table 8). This result seems to indicate that the presence of disfluencies in speech, typical of stuttering or not, is enough to worsen the perceived degree of fluency of a statement.
The occurrence of word final prolongations in the speech of individuals both with and without stuttering corroborates previous research, especially studies with different language variants of Brazilian Portuguese (Pinto et al. 2013; Costa et al. 2017; Natke et al. 2006; Campbell & Hill, 1998; Souza et al. 2013; Tumanova et al. 2011; Andrade, 2004) and fluent trait (Silva et al. 2016; Nogueira et al. 2015; Celeste & Reis, 2013; Roberts et al. 2009; Oliveira, Bernardes, Broglio, & Capellini, 2010). Juste and Andrade (2011) analysed the influence of word size and break point of the syllables in the speech of adolescents and adults in both groups (stutterers and non-stutterers). The results showed no influence of word size on the number of disfluencies, with a predominance of prolongations in monosyllabic words for the two groups studied, as in the present research. These combined results suggest that prolongations of monosyllabic words do not appear to be a trait of stuttering, but a linguistic marker of hesitation (Betz & Kosmala, 2019; Alvar, Lee, & Huber, 2019; Gold et al. 2018; Defrancq & Plevoets, 2018; Bellinghausen et al. 2019; Castro et al. 2014).
In a study with German pre-school children with and without fluency disorders, prolongation occurred in both groups, but was more frequent in stutterers (Natke et al. 2006). The same was found in a study carried out with adults who spoke Brazilian Portuguese (Pinto et al. 2013), but these studies did not indicate the place of rupture (the breaking point). However, another study with fluent adults who spoke Brazilian Portuguese showed that prolongation was the most frequent stuttering-like disfluency and occurred only at the end of words (Andrade & Martins-Reis, 2011).
In a study carried out by Moniz (2006), the acceptability of end of word prolongation was as good as that of filled pauses, and in some cases even greater. This seems to suggest that prolongation at the end of words is a disfluency common to all speakers, fluent or stutterers. One should also consider the possibility that prolongation is not only a disfluency but also a component of discourse capable of exercising its own function in the statement (Macgregor et al. 2009; Eklund, 2001; Howell, 2007; Schnadt & Corley, 2006).
Regarding the position of the prolongation, the occurrence of this disfluency at the beginning of the word is related to lower naturalness of speech, while its positioning at the end of the word indicates greater naturalness of speech (Tables 7 and 8). In a study by Roberts et al. (2009), prolongations, often described as a stuttering symptom and not typical of "normal" speech, were produced by 11 of the 25 subjects. Many of the prolongations occurred in words at the end or beginning of a sentence, in places where an interjection would normally occur; these prolongations were not accompanied by tension and were relatively brief. The authors recommend caution in classifying all prolongations in adults as part of their stuttering. Juste and Andrade (2010) found that the prolongations observed in the speech of individuals without stuttering occurred exclusively in the last phoneme of the last syllable of the word. For the authors, these prolongations seem to have the same purpose as hesitation: they are strategies used to facilitate the co-articulation between words.
Studies have shown that in the speech of individuals with fluency disorders (such as stuttering and Tourette syndrome), prolongations are more frequent within the word (Silva et al. 2016; Nogueira et al. 2015; Van Borsel, Goethals, & Vanryckeghem, 2004), thus highlighting the existence of a rupture (break) of the lexical unit (Silva et al. 2016; Juste & Andrade, 2006), which is a characteristic of non-hesitant prolongations. As the aim of the present study was to characterize prolongations at the end of words, the prolongations that appear to be typical of stuttering (within words) were not accounted for. In this sense, we cannot say that the groups did not differ as to the general occurrence of prolongations. Juste and Andrade (2011) found more prolongations in the nucleus position in the speech of individuals with stuttering, corroborating the present study, in which a greater occurrence of prolongations in vowels than in consonants was observed. In Brazilian Portuguese the vowel is obligatorily the nucleus of the syllable (Câmara Junior, 1976; Castro & Wertzner, 2009). It can also be seen that in Brazilian Portuguese there is a predominance of words that end with a vowel, which can be used as a marker of hesitation: the speaker takes advantage of the lengthened word to correct lexical-semantic processing failures (Eu comi banana versus Eu comi éh banana). Nogueira et al. (2015) pointed out that prolongations in fluent speakers appeared shorter than those of individuals who stuttered, but did not extract measures for comparison purposes.
In the present study, on the other hand, in which acoustic parameters were used for analysis, it was found that individuals without fluency disorder presented a greater variation in the duration of prolonged sounds than stutterers, and no difference was seen between the groups in the variation of the F0 mean. Juste and Andrade (2010) studied the influence of the tonicity and location of the rupture (break) and saw that in fluent individuals, prolongation is present in the final syllable of words. They also pointed out that the acoustic characteristics of prolonged sounds do not seem to be typical of stuttering. This statement seems to be confirmed by the present study, as prolongations at the end of words occurred in both the PWS and the PWDS groups with largely similar characteristics (Tables 1-6) and were judged by lay listeners to be as fluent as the filled hesitations (Tables 7 and 8), thus supporting the hypothesis that prolongations at the end of words function as hesitations and therefore should not be counted as stuttering-like disfluencies.

Conclusions
The occurrence of prolongation at the end of words was not influenced by the length of the word in terms of syllables and phones, with a predominance of prolongations in monosyllabic words for the PWS and PWDS groups. There were more occurrences of prolongations in vowels than in consonants. Concerning the acoustic aspects, non-stutterers presented a greater variation in the duration of prolonged sounds than stutterers. The groups did not differ in the variation of the F0 mean.
Prolongations in rhyme position at the end of words were judged as a common disfluency. The findings of the present study, as well as other studies presented here, indicate that prolongations can be classified as either a stuttering-like or a common disfluency, depending on the duration, position in the word and presence of physical concomitants. Consideration of such characteristics in the classification of prolongation as a disfluency may contribute to an increase in the diagnostic precision of fluency disorders.

Table 1 - Distribution of prolongations according to the number of syllables in the word with prolongations

Table 2 - Distribution of prolongations according to the number of sounds in the word with prolongations

Table 3 - Distribution of prolongations according to the prolonged phone

Table 4 - Distribution of prolongations according to the classification of the prolonged sound

Table 5 - Frequency of prolongations in vowels and consonants

Table 6 - Comparison between the groups of the variation of prolonged sound duration and of mean frequency

Table 7 - Perceptual test responses

Table 8 - Comparison of the degrees of fluency within sentences and comparison of the statements within the degrees of fluency