
The long-term average spectrum in research and in the clinical practice of speech therapists

Abstracts

BACKGROUND: one of the great difficulties in evaluating a voice is judging its quality through perceptual auditory analysis, which, although frequently used, is influenced by socioeconomic and cultural aspects as well as by individual preferences. Many adjectives and methods are used in this assessment, and the subjectivity involved in the process leads to disagreement between listeners and to difficulty in reaching a consensus on terminology. In this context, the voice laboratory, and more specifically computerized acoustic analysis, has guided and complemented speech-language treatment. Among the several possibilities of spectrographic analysis, the Long-Term Average Spectrum (LTAS) quantifies voice quality, marking differences between genders, ages, professional voices (spoken and sung) and dysphonic voices. The LTAS has been widely used in voice research: because it shows the contribution of the glottal source and of resonance to voice quality, it provides objective parameters for evaluating an aspect that usually depends on auditory perception. AIM: to demonstrate how the LTAS can be applied in voice research and in speech-language therapy practice, describing the technical aspects required to produce and interpret its results, as well as its limitations. CONCLUSION: voice research has developed considerably over the last two decades, especially because of the advent of the voice and speech laboratory. For this reason, knowledge about the applicability of one more voice analysis tool, the LTAS, together with the existing need for further studies in this area, will most certainly contribute to the creation of new lines of research, not only in professional voice but also in voice therapy.

Key Words: Voice Quality; Voice Training; Speech Acoustics; Speech Perception.




UPDATE ARTICLES

The long-term average spectrum in research and in the clinical practice of speech therapists*

* Study carried out at the Universidade Federal de São Paulo - Escola Paulista de Medicina.

Suely Master (I); Noemi De Biase (II); Vanessa Pedrosa (III); Brasília Maria Chiari (IV)

(I) Speech-language pathologist. PhD in Sciences from the Universidade Federal de São Paulo - Escola Paulista de Medicina. Assistant Professor (PhD), Department of Performing Arts, Universidade Estadual Paulista Júlio de Mesquita Filho

(II) Physician. PhD in Medicine from the Universidade Federal de São Paulo - Escola Paulista de Medicina. Associate Professor, Department of Fundamentals of Speech-Language Pathology, Pontifícia Universidade Católica de São Paulo

(III) Speech-language pathologist. Master's student at the Universidade Federal de São Paulo - Escola Paulista de Medicina

(IV) Speech-language pathologist. Full Professor, Department of Speech-Language Pathology, Universidade Federal de São Paulo - Escola Paulista de Medicina

Correspondence address: Suely Master, Rua Dom Luís Lasagna, 400, São Paulo - SP - CEP 04266-030 (smaster@ia.unesp.br)


Introduction

One of the greatest difficulties in evaluating a voice is judging its quality through perceptual analysis, which, although it is the gold standard, involves socioeconomic and cultural aspects as well as individual preferences (Biemans, 2002; Medrado et al., 2005; Bele, 2005). Several adjectives are used in perceptual evaluation, and the methods applied, given the inherent subjectivity of the process, lead to disagreement among listeners and to difficulty in reaching a consensus on terminology (Bele, 2002). In this context, acoustic analysis has made it possible to guide and complement speech-language treatment with more objective data.

Among the several possibilities of spectrographic analysis, the long-term average spectrum (LTAS) makes it possible to quantify the quality of a voice, marking the differences between genders, ages, professional voices (spoken and sung) and dysphonic voices (Leino, 1993; Mendoza et al., 1996; Navarro, 2000; Barrichelo et al., 2001; Hartl et al., 2001; Linville & Rens, 2001; Bele, 2002; Camargo, 2002; Sjölander, 2003; Jónsdottir et al., 2003; Hartl et al., 2003; Laukkanen et al., 2004; Camargo et al., 2004; Pinczower & Oates, 2005; Soyama et al., 2005).

Some features of an emission, such as voice quality, are more stable and become more evident in extended speech samples; this is precisely one of the greatest advantages of using the LTAS (Camargo, 2002). Another advantage is that, if the acoustic signal is long enough, the resulting mean spectrum is not affected by differences in the speech sample (subject matter and articulation), which gives a certain degree of reliability to comparisons between speakers and between studies (Frokjaer-Jensen & Prytz, 1976; Kitzing, 1986; Löfqvist, 1986).

The aim of this study is to describe the application of the LTAS and the interpretation of its data, considering acoustic events, auditory perception and the physiology of phonation, based on full texts retrieved from the MEDLINE database and published between 1976 and 2005, with special attention to the last 5 years. By providing objective measures of voice quality, the LTAS is an excellent working tool that complements both the evaluation and the follow-up of voice treatment, whether therapeutic or pedagogic, and contributes to advancing the still scarce scientific study of voice carried out with this method.

Long-term average spectrum

There are many ways to analyze a sound acoustically. The most widely used describe the sound through its waveform and its spectrum. According to Sundberg (1987), the spectrum shows the frequencies and the intensities of the signal partials, and is the acoustic correlate of voice quality. For the author, "there are important properties of the glottal source spectrum that can only be observed in a decibel spectrum. This is the case of the amplitude of the higher partials which, despite being small, is extremely important for the perception of timbre".

According to Nordemberg and Sundberg (2003), the LTAS in particular "reflects the contribution of the glottal source and the vocal tract for the voice quality". It condenses into a single spectrum the average of several spectra obtained, for example, every 200 milliseconds (5 spectra per second, 300 in 1 minute). The abscissa shows the frequency in Hertz and the ordinate shows the sound pressure level in decibels. Time is excluded from the long-term spectrum analysis and, therefore, all variables related to it, such as frequency and amplitude perturbation (jitter, shimmer, harmonics-to-noise ratio), are not captured unless they actually interfere with the glottal source (Frokjaer-Jensen & Prytz, 1976; Kitzing, 1986; Löfqvist, 1986). Thus the LTAS often has to be complemented with other types of acoustic analysis and, above all, listening to the voice is essential for interpreting the results. The reproducibility of the experiment must also be considered, since the voice of the same subject may present differently at different moments. For example, a normal voice may be more breathy or more tense at the end of a working day, and a dysphonic voice may show no evidence of dysphonia after a night's rest.
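The averaging step described above can be sketched in a few lines. This is only a minimal illustration, not the procedure of any particular analysis program: the function names (`power_spectrum`, `ltas_db`), the Hann window, the non-overlapping frames and the toy test tone are all assumptions made for the example, and a naive DFT is used so that the sketch needs nothing beyond the Python standard library.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum of one frame (Hann window, naive DFT)."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def ltas_db(signal, frame_len):
    """Average the power spectra of consecutive frames; return levels in dB."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    frames = [signal[i:i + frame_len] for i in starts]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    mean_power = [0.0] * (frame_len // 2 + 1)
    for frame in frames:
        for k, p in enumerate(power_spectrum(frame)):
            mean_power[k] += p / len(frames)
    # A small floor avoids log10(0) in empty bins.
    return [10 * math.log10(p + 1e-12) for p in mean_power]


# Toy signal: 1 s of a 100 Hz tone sampled at 800 Hz, analysed in 200 ms frames
# (5 spectra per second, as in the example given in the text).
fs = 800
tone = [math.sin(2 * math.pi * 100 * t / fs) for t in range(fs)]
frame_len = round(0.2 * fs)
levels = ltas_db(tone, frame_len)
peak_bin = max(range(len(levels)), key=levels.__getitem__)
peak_hz = peak_bin * fs / frame_len
```

Because the power spectra are averaged before conversion to decibels, the time dimension disappears from the result, which is why time-dependent measures such as jitter and shimmer cannot be read from an LTAS.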

According to Hammarberg et al. (1986), Sundberg (1987) and Leino (1993), the peaks of the LTAS, that is, its regions of higher energy concentration, are strongly related to the perception of different voice qualities.

Some methodological considerations must be observed in order to analyze the LTAS.

Duration of the sample

For Kitzing (1986) and Löfqvist (1986), if the signal to be analyzed is long enough, from 20 to 40 seconds, the resulting mean spectrum will not be strongly affected by differences in the speech material, such as accent, articulation pattern and other particularities inherent to each individual's emission. This is because the frequencies of the first formants, F1 and F2, which vary most between vowels, end up represented by an average, bringing out the formants with less variable values, F3, F4 and F5, which are related to voice quality (Sundberg, 1987).

Excluding voiceless sounds, pauses and silence from the analysis

In order to study the contribution of the glottal source to voice quality, it is necessary to eliminate voiceless sounds from the speech material, since they are generated by a noise source and may mask the information from the voice source (Linville & Rens, 2001). For Löfqvist (1986), analyzing the same speech sample with and without pauses, and with and without voiceless sounds, affects the spectrum especially in the frequency range of 5-8 kHz. In the study of professional voice quality, where the most important information is concentrated in the frequency range of 8 kHz, keeping the voiceless sounds does not directly interfere with the evaluation; for the analysis of dysphonic voices, however, the interference of this noise source in the spectrum must be excluded.
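One simple way to picture this selection step is to discard, before averaging, the frames that look like pauses or like noise. The sketch below is only an assumption-laden illustration: the source does not say how voiceless segments were removed, and the two criteria used here (frame energy and zero-crossing rate), the function names and both threshold values are hypothetical choices made for the example.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def keep_voiced_frames(frames, min_rms=0.05, max_zcr=0.25):
    """Keep frames that are loud enough (not a pause) and have a low
    zero-crossing rate (not noise-like). Both thresholds are illustrative."""
    return [f for f in frames
            if rms(f) >= min_rms and zero_crossing_rate(f) <= max_zcr]


fs = 8000
n = 160
voiced = [math.sin(2 * math.pi * 200 * t / fs) for t in range(n)]  # periodic, low frequency
noisy = [0.5 * (-1) ** t for t in range(n)]                        # sign flips at every sample
silent = [0.0] * n                                                 # pause
kept = keep_voiced_frames([voiced, noisy, silent])
```

Only the periodic frame survives in this toy case; with real recordings the thresholds would have to be tuned, and the result checked by listening.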

Ways to measure the LTAS

The parameters used to measure the LTAS consider "the time when the spectrum energy is integrated and the frequency range where the energy is measured" (Navarro, 2000; Pinho & Camargo, 2001; Camargo et al., 2004). Nevertheless, there are no normative indexes or standard ways of measuring the LTAS, which somewhat hinders comparison between studies.

In general, the measure adopted by several authors is an indication of the slope of the curve, obtained by calculating the relation between the strongest and the weakest regions of the spectrum (Frokjaer-Jensen & Prytz, 1976; Kitzing, 1986; Hammarberg et al., 1986; Pinczower & Oates, 2005). This calculation may be done by manually measuring the peaks, in decibels, or automatically, by acoustic analysis programs that provide the average sound pressure level (Leq, equivalent sound level) of the whole emission and/or of a frequency range. The slope of the spectral curve has been directly related to voice quality: strong, resonant voices present less difference between the strong and weak regions of the spectrum, while weak, fluid voices present a greater difference (Hammarberg et al., 1986; Leino, 1993; Bele, 2002).
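The equivalent level mentioned above is an average of power, not of decibel values. A minimal sketch of the standard definition follows; the function name `leq_db` and the frame levels are made up for illustration and do not come from any of the cited studies.

```python
import math


def leq_db(levels_db):
    """Equivalent level: the dB value of the mean power of the given dB levels."""
    if not levels_db:
        raise ValueError("at least one level is required")
    mean_power = sum(10 ** (lv / 10) for lv in levels_db) / len(levels_db)
    return 10 * math.log10(mean_power)


# Illustrative frame levels in dB; the louder frames dominate the result.
frame_levels = [60.0, 70.0, 60.0, 70.0]
overall = leq_db(frame_levels)
arithmetic_mean = sum(frame_levels) / len(frame_levels)
```

For these values the equivalent level is higher than the arithmetic mean of the decibel values, which is one reason why levels computed by different programs should be compared with care.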

The peaks of the LTAS correspond to the range of variation of the fundamental frequency (f0), which is difficult to identify, and of the formants (F), and should be measured. The lower region of the spectrum, from 100 Hz to 1 kHz, concentrates more sound energy than the other regions and is related to the mean sound pressure level of an emission and to vocal loudness (Nordemberg & Sundberg, 2003; Laukkanen, Syrja, Laitala & Leino, 2004). Thus, the difference between the peaks at 1-5 kHz and at 5-8 kHz and the peak of this stronger region may be calculated.
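The band comparison described above can be sketched as follows. The band edges are the ones given in the text; everything else (the function names, the choice of the highest bin as the "peak" of a band, and the spectrum values) is an assumption made for illustration.

```python
def band_peak_db(freqs_hz, levels_db, low_hz, high_hz):
    """Highest LTAS level (dB) among bins with low_hz <= f < high_hz."""
    in_band = [lv for f, lv in zip(freqs_hz, levels_db) if low_hz <= f < high_hz]
    if not in_band:
        raise ValueError("no spectrum bins fall inside the requested band")
    return max(in_band)


def band_differences(freqs_hz, levels_db):
    """Peak level of the 1-5 kHz and 5-8 kHz bands relative to the 0.1-1 kHz band."""
    reference = band_peak_db(freqs_hz, levels_db, 100, 1000)
    return {
        "1-5 kHz": band_peak_db(freqs_hz, levels_db, 1000, 5000) - reference,
        "5-8 kHz": band_peak_db(freqs_hz, levels_db, 5000, 8000) - reference,
    }


# Made-up LTAS values, for illustration only.
freqs = [250, 500, 1000, 2000, 3000, 4000, 6000, 7000]
levels = [-3.0, 0.0, -8.0, -15.0, -20.0, -24.0, -35.0, -38.0]
diffs = band_differences(freqs, levels)
```

Smaller (less negative) differences indicate a flatter spectral slope, which the studies cited above associate with stronger and more resonant voices.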

Measuring the difference between the amplitudes of F1 and f0 (L1-L0) also provides information about the phonation mode (Sundberg, 1987). An f0 amplitude stronger than that of F1 indicates a more fluid, breathy or weak-intensity voice, while an F1 stronger than f0 indicates a tenser voice, more adducted vocal folds or a strong-intensity voice (Frokjaer-Jensen & Prytz, 1976; Kitzing, 1986; Hammarberg et al., 1986; Bele, 2002). Usually the F1 amplitude is higher than that of f0. Figure 1 shows a scheme of the spectrum regions corresponding to f0 and to the formant frequencies from which the above parameters are measured.
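As a small illustration of how the L1-L0 measure is read, the sketch below computes the difference and returns the tendency described in the text. The function names and the example levels are hypothetical, and the sign of the difference is only a cue to be weighed against listening, not a classification rule.

```python
def l1_minus_l0(level_f0_db, level_f1_db):
    """Difference between the F1 peak level (L1) and the f0 peak level (L0), in dB."""
    return level_f1_db - level_f0_db


def phonation_tendency(level_f0_db, level_f1_db):
    """Verbal reading of L1-L0, following the tendencies reported in the text."""
    diff = l1_minus_l0(level_f0_db, level_f1_db)
    if diff < 0:
        return "f0 stronger than F1: more fluid, breathy or weak-intensity voice"
    if diff > 0:
        return "F1 stronger than f0: tenser, more adducted or strong-intensity voice"
    return "f0 and F1 at the same level"


# Hypothetical peak levels taken from a normalized spectrum (dB).
example_diff = l1_minus_l0(level_f0_db=-6.0, level_f1_db=0.0)
example_label = phonation_tendency(level_f0_db=-6.0, level_f1_db=0.0)
```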


Tanner et al. (2005), seeking to establish LTAS indexes to guide the evaluation of a therapeutic process, observed a strong relation between the mean and the standard deviation of the measures obtained from several spectra of the same individual with functional dysphonia, before and after therapeutic intervention, and the perception of voice improvement. For the authors, these distribution measures are possible markers of improvement in vocal quality.

Spectrum normalization

To facilitate measurement and comparison between spectra, we suggest normalizing them, which means placing the strongest component of the spectrum at 0 dB and expressing the other components as negative values in dB. Some programs offer this option, while others, such as Praat, require a script to be run.
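The normalization just described amounts to subtracting the level of the strongest component from every component. A minimal sketch, with a hypothetical function name and made-up levels:

```python
def normalize_ltas(levels_db):
    """Shift the spectrum so that its strongest component sits at 0 dB."""
    if not levels_db:
        raise ValueError("the spectrum is empty")
    peak = max(levels_db)
    return [lv - peak for lv in levels_db]


# Made-up absolute levels in dB; after normalization the peak is 0 dB
# and every other component is negative.
normalized = normalize_ltas([62.0, 70.0, 55.5])
```

Differences between components are unchanged by this shift, so measures such as L1-L0 or band differences give the same values before and after normalization.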

Effect of intensity on the LTAS

According to Nordemberg and Sundberg (2003), research involving sound pressure level measurements and recordings of the patient's voice needs to be carefully monitored to avoid hasty conclusions, since the frequency response is not linear for a given increase in intensity. The authors point out that the gain at high frequencies is greater than at low frequencies and, thus, the region up to 0.5 kHz will be less affected than the region of 2-4 kHz, for example. Comparing data produced at different intensity levels can therefore be questioned; to minimize this interference, the recording of the speech signal may be controlled with a sound level meter, together with the distance between the mouth and the microphone, since monitoring expiratory effort is almost impossible.

Calibrating the acoustic analysis program with a reference sound is also a basic procedure for this measurement in most studies involving the LTAS (Hammarberg et al., 1986; Leino, 1993; Laukkanen, Syrja, Laitala & Leino, 2004; Pinczower & Oates, 2005). Nordemberg and Sundberg (2003), considering an intensity variation between strong and weak loudness of 2 dB, demonstrated a strong linear relation between the mean sound pressure level and the resulting LTAS. For the lower frequencies the gain factor is linear, while for frequencies between 1.5 and 3.0 kHz each increase of 1 dB corresponds to 1.4 dB for men and 1.6 dB for women, who need a greater subglottal pressure to obtain the same loudness. In a previous study, White and Sundberg (2000), analyzing the variation of intensity in the spectra of baritones, had already observed that an increase of 10 dB in the SPL resulted in an increase of 15-20 dB in the partials close to 2.5 kHz, and that this relation is a function of the logarithm of the subglottal pressure. Figures 2 to 4 show emissions of the same speaker at 88.6 dB, 91.2 dB and 95.3 dB, measured at 15 cm; the F1-f0 relation changes gradually and, in a hyperkinetic adjustment, besides the energy increase in the F4 region, f0 becomes much weaker than F1 and F4 and F5 approach each other and merge into a single peak.
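The gain factors quoted above can be turned into a small worked example. The sketch assumes, for illustration only, that the relation is applied as a simple proportionality within the 1.5-3.0 kHz band; the dictionary, the function name and the 5 dB example are hypothetical.

```python
# Gain factors reported in the text for the 1.5-3.0 kHz band:
# dB of change in the band per 1 dB of change in overall sound pressure level.
GAIN_1500_3000_HZ = {"men": 1.4, "women": 1.6}


def predicted_band_change_db(spl_change_db, group):
    """Expected level change in the 1.5-3.0 kHz band for a given SPL change."""
    if group not in GAIN_1500_3000_HZ:
        raise ValueError("group must be 'men' or 'women'")
    return GAIN_1500_3000_HZ[group] * spl_change_db


# Hypothetical case: two recordings of the same speaker differing by 5 dB overall.
change_men = predicted_band_change_db(5.0, "men")
change_women = predicted_band_change_db(5.0, "women")
```

Under this assumption, a 5 dB difference in overall level between two recordings would by itself account for a difference of about 7 to 8 dB in this band, which illustrates why recordings made at different intensities should not be compared directly.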


LTAS and voice quality

The LTAS has been used in several studies because it makes it possible to "quantify" voice quality, marking the differences between sexes, ages, professional voices (spoken and sung) and dysphonic voices, and so contributing to the evaluation and follow-up of training and/or treatment (Kitzing, 1986; Hammarberg, Fritzell, Gauffin, Sundberg, 1986; Leino, 1993; Mendoza et al., 1996; Cleveland, Sundberg & Stone, 2001; White, 2001; Laukkanen, Syrja, Laitala & Leino, 2004; Jorge et al., 2004).

. female and male voice. Marking the acoustic differences between male and female voices beyond the fundamental frequency and the structure of the formant frequencies, the results of Mendoza et al. (1996) showed, for women, a high level of energy at 3 kHz, corresponding to the third formant (F3) and probably coming from aspiration noise, and, because of this noise, a less marked slope of the spectral curve. The noise would be related to a posterior triangular gap, common in women, which gives them a breathy voice quality. This voice pattern may even be "chosen" as a socio-cultural behavior, at least among American and Spanish women, who are the ones studied through LTAS so far.

Comparing the LTAS at different phonation loudness levels, Nordemberg and Sundberg (2003) observed that the frequency of F3 is almost 20% higher for women, who presented peaks at 2.9 kHz and 4.1 kHz; for men, these peaks are at 2.4 kHz and 3.4 kHz. They report that, for the same SPL of 70 dB, women presented a spectral curve on average 3.5 dB stronger in the region of 1-4 kHz, probably because they tend to need a higher degree of vocal effort to reach the same vocal intensity as men.
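As a quick check of the proportion quoted in this paragraph, the ratio of the female to the male peak frequencies can be worked out directly from the values reported in the text:

```latex
\frac{2.9\ \text{kHz}}{2.4\ \text{kHz}} \approx 1.21,
\qquad
\frac{4.1\ \text{kHz}}{3.4\ \text{kHz}} \approx 1.21
```

Both peaks therefore lie roughly 20% higher in the female spectra, in line with the proportion stated by the authors.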

. child voice. White (2001), studying a group of children and adolescents of both sexes, observed during singing a peak at 5 kHz for boys and a flat curve for girls, that is, a less steep fall of the spectrum. The author also observed differences between genders in the way speech intensity is varied: girls, like adult women, tend to speak loudly using greater glottal adduction rather than allowing a greater escape of airflow between the vocal folds. Sjölander (2003), building on the previous study, confirmed a relation between these findings and the auditory ability to tell these voices apart.

. senile voice. Linville and Rens (2001) investigated the resonance changes that accompany aging in 80 speakers divided by age and sex, on the assumption that aging brings an increase in the length of the vocal tract due to changes in the phonation structures, and that the LTAS is sensitive to these changes. The acoustic findings show that the elderly of both genders present lower formant frequencies, especially the women. The results confirm the anatomical findings and, together with data from previous studies, suggest a mixed model in which vocal tract resonance and articulation pattern both affect the formant frequencies of this age group.

In another study with the same group, Linville (2002) identified differences between the spectra of elderly women and those of young women: greater amplitude at 340 Hz and at specific points of the 6-7 kHz region, and low levels of energy at 3.04 kHz and 3.2 kHz. Both female groups, young and elderly, were perceptually identified as having a breathy voice quality (especially the elderly), but this quality would be revealed by an energy increase at different points of the spectrum, at 3 kHz and 6 kHz, suggesting differences in the configuration of the glottal gap, which would be more posterior in young women and more anterior in elderly women. Nevertheless, these spectra deserve more conclusive studies with a larger population. The spectrum of the young speakers showed less difference between the bands above and below 1.6 kHz, that is, a curve with a smaller slope. Compared with the young, the elderly presented lower levels of energy at 1.6 kHz (F2), without a plausible physiological explanation, and a tendency toward increased energy in the high-frequency region. The fundamental frequency of elderly women and men is very similar, around 160 Hz, and has greater amplitude than in the young group. Soyama et al. (2005) investigated 8 individuals of both genders and found a significant increase of energy in the region from 2 to 4.5 kHz for elderly men and from 6.5 to 10 kHz for elderly women. They add that, although the 60 judges perceptually identified the genders, the results of the acoustic analysis through LTAS did not show this difference.

. professional voice. Sundberg (1987) identified a peak in the LTAS of lyric singers resulting from the clustering of F3, F4 and F5, which could be related to our perception of "brightness" and vocal projection: the "singer's formant" (FC), between 2.8 and 3.4 kHz. For the author, this peak would be an "intelligent" response of the lyric singer to the orchestra: the orchestra works in the lower region of the spectrum and the singer works in the higher region, so that the voice stands out. According to the author, a certain laryngeal configuration is necessary to generate an FC, in which the epilarynx becomes a resonator independent of the rest of the vocal tract, with a frequency of around 3 kHz. This high region is precisely where our hearing is most sensitive, 2-5 kHz (Sundberg, 1987). For Titze (2001), the epilarynx tube in these cases narrows in relation to the pharynx and hinders the airflow to the upper vocal tract, reducing the transglottal airflow between the vocal folds and changing their vibration mode. The closing phase of the vocal folds becomes shorter, increasing the intensity of the upper harmonics at 3 kHz. This process happens within a linear view of the interaction between source and filter.

Accordingly, Leino (1993) proposes the term "actor's formant" or "speaker's formant" (Ff) for the clustering of the third, fourth and fifth formants (F3, F4 and F5) at around 3.5 kHz in the projected voices of male actors. Studies performed with Finnish, German, African, Swiss and Australian actors corroborate this finding (Leino, 1993; Munro, 2002; Bele, 2002; Pinczower & Oates, 2005). The nature of the actor's formant is not yet totally clear. Figure 5 shows the "actor's formant", which appears at -20 dB in relation to the strongest peak of the normalized spectrum (Master et al., 2005).


Building on these considerations, and aiming at a better understanding of professional voices, several studies were carried out with speech and with different singing styles, either to verify the possibility of transferring adjustments from singing to speech and vice versa, or to try to establish a correlation between the variation of parameters such as phonation pitch, loudness and the acoustic spectrum and the perceptual analysis. Some of these studies are presented next:

Figueiredo (1993) observed that the LTAS is an efficient analysis tool when the purpose is to establish the identity of a speaker by comparing vocal patterns obtained from phonetic and spectrographic analysis. Navarro (2000), studying the emissions of sports announcers through different variables of acoustic and perceptual analysis, observed that the long-term spectra suggested a creaky vocal quality for the spontaneous speech of these speakers and a fluid vocal quality for sports narration. Cleveland et al. (2001), considering that country singers sing in a way very similar to the way they speak, compared the LTAS of 5 subjects during speech and during singing. They confirmed this hypothesis, since a very strong peak at 3.5 kHz was identified in both emissions. Barrichelo et al. (2001) examined whether opera singers carry over to speech the resonance effect technically acquired in singing and responsible for the brightness of the voice. The results suggest a higher concentration of energy at 2480 Hz-3 kHz and at 3480 Hz, respectively for male and female singing voices, in the region of the "actor/speaker formant", in both singing and spoken voice. For both genders, the frequency of F2 was lower in the control group, justifying a lower motor adjustment of the larynx during singing, which could support the initial hypothesis. Stone et al. (2003) investigated through LTAS, among several acoustic measures, the voices of lyric and Broadway singers, styles associated with different vocal techniques. The results, obtained with a very small sample, indicated a weaker f0 and stronger partials between 0.8 and 1.6 kHz, suggesting greater glottal adduction for the Broadway singers, similar to speech at strong loudness. The differences between these two singing styles would originate at the glottal level and in the vocal tract resonance. Pinczower and Oates (2005) compared male actors' voices at comfortable loudness and at maximum projection level and distinguished these voices through acoustic and perceptual analyses, emphasizing that the spectrum showed a higher concentration of energy at higher frequencies, around 3.4 kHz (Ff), for the strong emissions than for the ones produced in comfortable conditions.

Some studies successfully followed up the progress of voice training, comparing emissions before and after speech-language intervention. Munro (2002) followed up a voice training program and observed a higher concentration of energy at the fundamental frequency (f0) and in the first formant region (F1), due to the approximation of these two frequencies. The authors compared spontaneous vocal emissions of actors and observed that the spectrum of projected voices showed a great concentration of energy in the lower region of the spectrum, when F1 was closer to f0, and also at 2.5 kHz, 3 kHz and 4-4.5 kHz, events related to the perception of a projected voice. Laukkanen et al. (2004) trained the spoken voice of a group of theater students for 2 months, with and without real-time visual cues from the acoustic analysis, and observed in both groups an increase of 3-4 dB at 3-5 kHz in the LTAS. The authors drew attention to the effectiveness of training intensity variation with visual support in order to avoid the development of hyperfunctional mechanisms, which are revealed by an F1 much stronger than f0.

Bele (2002) compared the voices of Norwegian actors and teachers and observed the following differences in the LTAS: actors use emission mechanisms typical of strong intensities and therefore present smaller values in the relation between f0 and F1, and the "speaker's formant" region is stronger for the actors, but not as much as reported in the literature. According to the author, auditory evaluation was more efficient than the LTAS in differentiating these voices, which raises the following point: something affects our subjective judgment of vocal quality that cannot be measured objectively. The author observes that a peak at 3.5 kHz could also be related to nasal, rough and vocal fry voices, reinforcing the need to consider the perceptual analysis when analyzing with the LTAS.

. dysphonic voices. The LTAS does not diagnose laryngeal disorders (Hammarberg et al., 1986). Voice quality must be considered bearing in mind that, for the same etiological diagnosis, this quality may vary considerably, and that the same vocal quality may be present in different laryngeal disorders.

In breathy voices or voices with weak loudness, the main characteristics of the spectrum are little energy concentration at 0.4-4 kHz, corresponding to the main formants, and great concentration above the 5 kHz region (Soyama et al., 2005). The sound pressure level of f0 is also stronger than that of F1 (Sundberg, 1987). In hyperfunctional dysphonia, in voices with increased loudness, in tense voices and in resonant voices, the spectrum envelope falls less abruptly and the region of 2-4 kHz presents a higher concentration of energy, with F1 much stronger than f0 (Frokjaer-Jensen & Prytz, 1976; Kitzing, 1986; Hammarberg et al., 1986; Löfqvist, 1986; Leino, 1993).

Figure 6 shows a low, fluid voice that was little affected by resonance, taking as a reference the spectrum of the glottal source, which falls 12 dB per octave (Sundberg, 1987). The spectrum envelope also presents a reasonably accentuated energy fall at 2-3 kHz and a small peak at 3-4 kHz, at -40 dB, corresponding to F4.
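The 12 dB per octave reference mentioned here can be made concrete with a short calculation. The sketch below assumes an idealized source spectrum that falls at a constant rate from f0 upward; the function name and the example frequencies are hypothetical and are not taken from Figure 6.

```python
import math


def source_level_db(freq_hz, f0_hz, slope_db_per_octave=-12.0):
    """Level of an idealized source spectrum at freq_hz, relative to its level at f0_hz."""
    if freq_hz <= 0 or f0_hz <= 0:
        raise ValueError("frequencies must be positive")
    octaves_above_f0 = math.log2(freq_hz / f0_hz)
    return slope_db_per_octave * octaves_above_f0


# Hypothetical speaker with f0 = 100 Hz.
one_octave_up = source_level_db(200.0, 100.0)     # one octave above f0
five_octaves_up = source_level_db(3200.0, 100.0)  # five octaves above f0
```

Comparing a measured envelope with such a reference line gives a rough idea of how much the vocal tract resonances raise or lower the spectrum in each region.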


A great contribution of the LTAS to the treatment of dysphonia is that it enables an objective evaluation of voice quality before and after speech-language therapy and surgical interventions, especially of breathiness as a symptom of vocal fold paralysis (Hartl et al., 2001). The authors compared two cases of laryngeal paralysis before and after the onset of breathiness and observed an increase of energy in the medium and high regions of the spectrum and a decrease in the lower region.

In Figure 7 we can observe an f0 much stronger than F1 and a higher concentration of energy in the spectrum from 5 kHz upward, characteristic of breathy, weak and harmonically poor voices.


Laukkanen, Syrja, Laitala and Leino (2004) investigated the physiological, acoustic and perceptual aspects of the "throaty voice" in two cases, one male and one female. This voice quality, although not associated with laryngeal lesions, is harmful to vocal health. Among the results, the authors related the perception of this voice quality to an increase of energy in the F1 region, a decrease in F4 and, in front vowels, a decrease of F2, related to narrowing of the pharynx. In the male individual, there was evidence of a hyperfunctional motor adjustment. Camargo et al. (2004), in a study with 5 dysphonic patients, established positive correlations between the laryngeal and supralaryngeal adjustments found in the phonetically motivated vocal evaluation (perceptual evaluation) and LTAS measures, more specifically the spectral slope.

Conclusion

The LTAS is an acoustic analysis method sensitive to different voice qualities and, given the aspects presented, it is an adequate tool to complement objectively our auditory perception of this parameter. Its methodology is not simple, especially if the study involves sound pressure level measurement, but it has proved to be an efficient tool for analysing the more stable aspects of voice quality, since it "summarizes", by means of an average, several momentary spectra, revealing the contributions of the glottal source and of the filter to voice quality. It is not a diagnostic method, and perceptual analysis remains essential. Some aspects that depend on time resolution, such as f0, jitter, shimmer, harmonic-to-noise ratio and formant frequency analysis, are not covered by the LTAS, so other types of acoustic analysis are needed to complement it. The fact that its possibilities and limitations are not yet well understood, and that the parameters measurable in the LTAS still need to be standardized, points to a long path of studies ahead.
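
The idea of averaging several momentary spectra can be sketched in a few lines. The code below is a minimal illustration, assuming a Hann window, 50% overlap and a direct Fourier transform written out for self-containment; these choices, the function names and the synthetic test tone are assumptions, not the settings of any analysis program mentioned in this article.

```python
import cmath
import math

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hann-windowed frame."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def ltas_db(signal, frame_len=64, hop=32):
    """Average the power spectra of overlapping frames and return dB levels."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    spectra = [power_spectrum(signal[s:s + frame_len]) for s in starts]
    if not spectra:
        raise ValueError("signal shorter than one frame")
    mean_power = [sum(column) / len(spectra) for column in zip(*spectra)]
    # A small floor avoids log10(0) for empty bins.
    return [10.0 * math.log10(p + 1e-12) for p in mean_power]

# Synthetic tone centred on bin 8 of a 64-sample frame.
tone = [math.sin(2.0 * math.pi * 8 * i / 64) for i in range(512)]
ltas = ltas_db(tone)
peak_bin = ltas.index(max(ltas))
```

With a real recording the frames would be much longer and a fast Fourier transform would replace the direct transform used here. Because the average is taken over time, frame-by-frame information such as jitter or shimmer is no longer recoverable from the result, which is the limitation noted above.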

By reviewing the technical aspects involved in producing and interpreting the LTAS, this article contributes to speech-language intervention as well as to research in the area.

Brazilian culture, rich and diverse, with its different singing styles and many other unexplored popular manifestations, constitutes a vast field for research.

Acknowledgments: Prof. Dr. Anne Maria Laukkanen, Prof. Dr. Timo Leino and Prof. Dr. Paulo Augusto de Lima Pontes; Fundação para o Desenvolvimento da UNESP and Fundo de Auxílio aos Docentes e Alunos da UNIFESP.

References

Received on 13.10.2004.

Revised on 28.04.2005; 23.05.2005; 29.07.2005; 06.12.2005; 06.02.2006; 14.03.2006.

Accepted for publication on 14.03.2006.

Update Article

Peer-Reviewed Article

Conflict of interest: none

  • BARRICHELO, V. O.; HEUER, J. R.; DEAN, C. M.; SATALOFF, R. T. Comparison of singer's formant, speaker's ring, and LTAS among classical singers and untrained speakers. J. Voice, v. 15, n. 3, p. 344-350, 2001.
  • BELE, I. V. Professional speaking voice: a perceptual and acoustic study of actor's and teachers voices. 2002. 253 f. Tese (Doutorado em Educação) - University of Oslo. Noruega.
  • BELE, I. V. Reliability in perceptual analysis of voice quality. J. Voice, v. 19, n. 4, p. 555-573, 2005.
  • BIEMANS, M. A. J. Gender variation in voice quality. 2000. 212 f. Dissertação (Mestrado) - Katholieke Universiteit Nijmegen. The Netherlands (Utrecht) 2000.
  • CAMARGO, Z. A. Análise da qualidade vocal de um grupo de indivíduos disfônicos: uma abordagem interpretativa e integrada de dados de natureza acústica perceptiva e eletroglotográfica. 2002. 283f. Tese (Doutorado em Lingüística Aplicada e Estudos da Linguagem) - Pontifícia Universidade Católica, São Paulo.
  • CAMARGO, Z.; VILARIM, G. S.; CUKIER, S. Parâmetros perceptivo-auditivos e acústicos de longo termo da qualidade vocal de indivíduos disfônicos. R. Cefac, v. 6, n. 2, p. 189-196, 2004.
  • CLEVELAND, T. F.; SUNDBERG, J.; STONE, R. E. Long-term average spectrum characteristics of country singers during speaking and singing. J. Voice, v. 15, n. 1, p. 54-60, 2001.
  • FIGUEIREDO, R. M. A eficácia de medidas extraídas do espectro de longo termo para a Identificação de Falantes. Cad. Est. Ling., v. 25, p. 129-160, 1993.
  • FROKJAER-JENSEN, B.; PRYTZ, S. Registration of voice quality. Bruel Kjaer Technol. Review, v. 3, p. 3-17, 1976.
  • HAMMARBERG, B.; FRITZELL, B.; GAUFFIN, J.; SUNDBERG, J. Acoustic and perceptual analysis of vocal dysfunction. J. Phonetics, v. 14, p. 533-547, 1986.
  • HARTL, D. M.; HANS, S.; VAISSIERE, J.; RIQUET, M.; BRASNU, D. F. Objective voice quality analysis before and after onset of unilateral vocal fold paralysis. J. Voice, v. 15, n. 2, p. 351-61, 2001.
  • HARTL, D. A.; HANS, S.; VAISSIERE, J.; BRASNU, D. A. Objective acoustic and aerodynamic measures of breathiness in paralytic dysphonia. Eur. Arch. Otorhinolaryngol., v. 260, n. 4, p. 175-182, 2003.
  • JORGE, M. S.; GREGIO, F. N.; CAMARGO, Z. Qualidade vocal de indivíduos submetidos a laringectomia total: aspectos acústicos de curto e de longo termo em modalidades de fonação esofágica e traqueoesofágica. R. Cefac, v. 6, n. 3, p. 319-329, 2005.
  • JÓNSDOTTIR, V.; LAUKKANEN, A. M.; SIIKKI, I. Changes in teachers' voice quality during a working day with and without electric sound amplification. Folia Phoniatr. Logop., v. 55, n. 5, p. 267-280, 2003.
  • KITZING, P. LTAS criteria pertinent to the measurement of voice quality. J. Phonetics, v. 14, p. 477-482, 1986.
  • LAUKKANEN, A. M.; SUNDBERG, J.; BJÖRKNER, E. Acoustic study of the "throaty" voice quality. TMH-QPSR, KTH, v. 46, p. 14-24, 2004.
  • LAUKKANEN, A. M.; SYRJA, T.; LAITALA, M.; LEINO, T. Effects of two-month vocal exercising with and without spectral biofeedback on student actor's voice. Logoped. Phoniatr. Vocol, v. 29, n. 2, p. 66-76, 2004.
  • LEINO, T. Long-term average spectrum study on speaking voice quality in male actors. In: Stockholm Music Acoustics Conference, 1993, Stockholm. Proceedings of the Stockholm Music Acoustics Conference Stockholm: Royal Swedish Academy of Music, 1993. p. 206-210.
  • LÖFQVIST, A. The long time average spectrum as a tool in voice research. J. Phonetics, v. 14, n. 3, p. 471-475, 1986.
  • LINVILLE, S. E.; RENS, J. Vocal tract resonance analysis of aging voice using the long term average spectra. J. Voice, v. 15, n. 3, p. 323-330, 2001.
  • LINVILLE, S. E. Source characteristics of aged voice assessed from Long-term average spectra. J. Voice, v. 16, n. 4, p. 477-479, 2002.
  • MASTER, B.; BIASE, N.; CHIARI, B. M.; RAMOS, L. R.; LAUKKANEN, A. M. Voz projetada de atores masculinos: um estudo de emissão de longo termo (LTAS) com especial referência ao "formante do ator". In: Congresso Brasileiro de Fonoaudiologia, 13., 2005, São Paulo. Anais do XIII Congresso Brasileiro de Fonoaudiologia. Santos: Sociedade Brasileira de Fonoaudiologia - Suplemento Especial, 2005. 1 CD-ROM.
  • MENDOZA, E.; VALENCIA, N.; MUÑOZ, J.; TRUJILLO, H. Differences in voice quality between men and women: use of the long-term average spectrum. J. Voice, v. 10, n. 1, p. 59-66, 1996.
  • MEDRADO, R; FERREIRA, L. P.; BEHLAU, M. Voice-over: Perceptual and Acoustic Analysis of Vocal Features. J. Voice, v. 19, n. 3, p. 340-349, 2005.
  • MUNRO, M. Lessac tonal action in women's voices and the actor's formant: a comparative study. 2002. 235 f. Dissertação (Doutorado em Lingüística) - Potchefstroom University for Christian Higher Education. South Africa.
  • NAVARRO, C. A. Perfil vocal e análise acústica da qualidade vocal de locutores esportivos. 2000. 107 f. Dissertação (Mestrado em Fonoaudiologia) - Pontifícia Universidade Católica. São Paulo.
  • NORDENBERG, M.; SUNDBERG, J. Effect on LTAS of vocal loudness variation. TMH-QPSR, KTH, v. 45, p. 87-91, 2003.
  • PINCZOWER, R.; OATES, J. Voice projection in actors: the LTAS features that distinguish comfortable acting voice from voicing with maximal projection in males voice. J. Voice, v. 19, n. 3, p. 440-453, 2005.
  • PINHO, S. M. R.; CAMARGO, Z. Introdução à análise da voz e da fala. In: Pinho, S. M. R. Tópicos em voz Rio de Janeiro: Guanabara Koogan, 2001.
  • SJÖLANDER, P. Perceptual relevance of the 5kHz spectral region to sex identification in children's singing voices. In: Stockholm Music Acoustics Conference, 2003, Stockholm. Proceedings of the Stockholm Music Acoustics Conference. Stockholm: Royal Swedish Academy of Music, 2003. p. 503-506.
  • SOYAMA, C. K.; ESPASSATEMPO, C. L.; GREGIO, F. N.; CAMARGO, Z. Qualidade vocal na terceira idade: parâmetros acústicos de longo termo de vozes masculinas e femininas. R. Cefac, v. 7, n. 2, p. 267-279, 2005.
  • STONE, R. E.; CLEVELAND, F. T.; SUNDBERG, J. P.; PROKOP, J. Aerodynamic and acoustical measures of speech, operatic and Broadway vocal styles in professional female singer. J. Voice, v. 17, n. 3, p. 283-297, 2003.
  • SUNDBERG, J. The science of the singing voice Illinois: Northern Illinois University Press, 1987.
  • TANNER, K.; ROY, N.; ASH, A.; BUDER, E. Spectral moments of the LTAS: sensitive index of voice change after therapy? J. Voice, v. 19, n. 2, p. 211-222, 2005.
  • TITZE, I. R. Acoustic interpretation of resonant voice. J. Voice, v. 15, n. 4, p. 519-28, 2001.
  • WHITE, P.; SUNDBERG, J. Spectrum effects of subglottal pressure variation in professional baritone singers. TMH-QPSR, KTH, v. 4, p. 29-32, 2000.
  • WHITE, P. Long-term average spectrum analysis of sex- and gender-related differences in children's voice. Logoped. Phoniatr. Vocol., v. 26, n. 3, p. 97-101, 2001.
  • Correspondence address:
    Suely Master
    Rua Dom Luís Lasagna, 400
    São Paulo - SP - CEP 04266-030
  • *
    Work carried out at the Universidade Federal de São Paulo – Escola Paulista de Medicina.
  • Publication Dates

    • Publication in this collection
      15 May 2006
    • Date of issue
      Jan 2006

    Pró-Fono Produtos Especializados para Fonoaudiologia Ltda. Condomínio Alphaville Conde Comercial, Rua Gêmeos, 22, 06473-020 Barueri , São Paulo/SP, Tel.: (11) 4688-2220, Fax: (11) 4688-0147 - Barueri - SP - Brazil
    E-mail: revista@profono.com.br