Accuracy of traditional and formant acoustic measurements in the evaluation of vocal quality

Lopes, Leonardo Wanderley; Alves, Jônatas do Nascimento; Evangelista, Deyverson da Silva; França, Fernanda Pereira; Vieira, Vinícius Jefferson Dias; Lima-Silva, Maria Fabiana Bonfim de; Pernambuco, Leandro de Araújo

ABSTRACT

Purpose

Investigate the accuracy of isolated and combined acoustic measurements in the discrimination of voice deviation intensity (GD) and predominant voice quality (PVQ) in patients with dysphonia.

Methods

A total of 302 female patients with voice complaints participated in the study. The sustained /ɛ/ vowel was used to extract the following acoustic measures: mean and standard deviation (SD) of fundamental frequency (F₀), jitter, shimmer, glottal to noise excitation (GNE) ratio and the mean of the first three formants (F1, F2, and F3). Auditory-perceptual evaluation of GD and PVQ was conducted by three speech-language pathologists who were voice specialists.

Results

In isolation, only GNE provided satisfactory performance in discriminating GD and PVQ. Classification of GD and PVQ improved when the acoustic measures were combined. Mean F₀, F2, and GNE (healthy × mild-to-moderate deviation), the SDs of F₀, F1, and F3 (mild-to-moderate × moderate deviation), and mean jitter and GNE (moderate × intense deviation) were the best combinations for discriminating GD. The best combinations for discriminating PVQ were mean F₀, shimmer, and GNE (healthy × rough), F3 and GNE (healthy × breathy), mean F₀, F3, and GNE (rough × tense), and mean F₀, F1, and GNE (breathy × tense).

Conclusion

In isolation, GNE proved to be the only acoustic parameter capable of discriminating GD and PVQ. There was a gain in classification performance for discrimination of both GD and PVQ when traditional and formant acoustic measurements were combined.

Keywords
Voice; Accuracy; Acoustic Analysis; Vocal Quality; Voice Disorders

INTRODUCTION

The voice is essentially a multidimensional phenomenon that includes physiological, perceptual, aerodynamic, acoustic and emotional aspects. Voice evaluation should therefore follow the same principle, considering and integrating these dimensions to achieve an overall view of dysphonia⁽¹⁾.

The goal of voice evaluation is to analyze voice quality, identify whether the voice is healthy or not, diagnose the presence of a perturbation, determine a prognosis, and monitor the patient's progress during voice therapy⁽²⁾. The process of voice evaluation generally includes a visual laryngeal examination, auditory-perceptual voice evaluation, acoustic analysis, aerodynamic evaluation and voice self-evaluation⁽¹⁾.

Auditory-perceptual analysis is considered the primary reference standard used by the speech therapist when performing voice evaluations⁽²⁾. It is considered a subjective method, as it depends on the evaluator's judgment and has an exclusively impressionistic nature⁽²,³⁾. This type of evaluation characterizes the voice deviation intensity as well as the predominant voice quality⁽⁴⁾.

Acoustic analysis is a more objective procedure. It is noninvasive and is increasingly used in the voice clinic. In traditional acoustic analysis, two types of measure are used: perturbation measures (jitter and shimmer) and noise measures. Jitter indicates the short-term variability of the fundamental frequency and is measured between neighboring glottal cycles. Shimmer corresponds to the short-term variability in the sound wave amplitude. Glottal-to-noise excitation (GNE) measures the additional noise in the sound signal, irrespective of the noise modulated by the glottal mechanism, indicating whether the voice signal comes from vocal fold vibration or from turbulent airflow generated in the vocal tract. Measures of perturbation and noise are therefore focused on the glottal source⁽³⁻⁵⁾.
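As a rough illustration of the perturbation measures described above, the sketch below computes jitter and shimmer in their common "local" form: the mean absolute difference between neighboring cycles, divided by the mean value and expressed as a percentage. This formulation is an assumption, since the text does not state which variant of each measure was used, and the function names are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between neighbouring cycles, relative to the mean (%)."""
    if len(values) < 2:
        raise ValueError("need at least two glottal cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def local_jitter(periods_s):
    """Cycle-to-cycle variability of the glottal period (hypothetical helper)."""
    return local_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variability of the peak amplitude (hypothetical helper)."""
    return local_perturbation(peak_amplitudes)
```

A perfectly periodic signal yields 0% for both measures; larger cycle-to-cycle irregularity yields larger values.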

In addition, some measures relate to the resonance of the sound wave in the vocal tract, which changes with the positioning of the vocal tract structures and the volume of the resonance cavities during voice production. Such measures are called formants and correspond to energy concentrations along the vocal tract⁽³⁻⁶⁾.

The vocal tract has a three-dimensional configuration and the sound that is produced in the glottis is modified by the positioning of structures such as the larynx, soft palate, tongue, lips and jaw. The frequencies of the glottal signal that are reinforced by the supraglottic vocal tract are called formants, and their analysis provides information about adjustments being made in the supraglottic vocal tract⁽⁶⁻¹⁰⁾.

Adjustments in the positioning of the articulators and in the volume of the resonance cavities determine the values of formants⁽⁶⁻⁸,¹¹⁾. Thus, an increase in the first formant (F1), for example, is related to a downward jaw adjustment, anterior lowering of the tongue and pharyngeal narrowing. An anterior adjustment of the tongue which is then lowered generates an increase in the second formant (F2). The formation of a smaller cavity immediately behind the incisors can raise the value of the third formant (F3)⁽⁶⁻⁸,¹⁰,¹¹⁾.

In this context, there is a strong interaction between the source producing the sound (glottis) and the filter. The feedback from pressure encountered by the sound wave in the vocal tract modifies the glottal airflow and vocal fold vibration mode⁽¹²⁾.

Some studies⁽⁸⁻¹⁰,¹³⁻¹⁵⁾ have observed that patients with a voice disorder make adjustments not just in the glottis but also in the supraglottis. These patients have lower formant values (F1, F2, F3) than individuals without a voice disorder⁽¹⁰,¹³,¹⁵⁾.

Thus, these adjustments may be related to the development or maintenance of, or may co-occur with, voice disorders⁽¹¹,¹³⁾. Such adjustments are not necessarily captured by traditional acoustic measures, as these focus on the glottal source⁽¹⁶⁾.

Notably, acoustic analysis does not replace auditory-perceptual analysis but rather integrates the auditory and physiological levels⁽⁶⁻⁸⁾. A combination of acoustic and auditory-perceptual measures increases the accuracy in determining the presence or absence of a voice disorder and the intensity of the deviation present⁽¹⁷,¹⁸⁾.

For this reason, it is important to investigate whether a combination of measures relating to the source (perturbation and noise) and filter (formantic measures) allows a better classification of voice signals in regard to deviation intensity and predominant voice quality.

This study therefore aims to investigate the accuracy of both isolated and combined traditional acoustic and formantic measures in the discrimination of the voice deviation intensity and predominant voice quality in dysphonic patients. We hypothesize that a combination of traditional acoustic and formantic measures improves the discrimination of voice deviation intensity and also the discrimination between different predominant voice qualities.

METHODS

Study design

This was a descriptive, cross-sectional, observational study, evaluated and approved by the Ethics Committee of the Health Sciences Center, Federal University of Paraíba (UFPB), under protocol number 52492/12. All participants signed a free and informed consent form authorizing the study.

Sample

Patients treated at the Department of Speech Therapy's Voice Laboratory (UFPB) in the period between April 2012 and July 2015 participated in this study. The following eligibility criteria were considered for participation:

Being female, given the relationship between this variable and the mean F₀ measure, which is associated with the anatomical characteristics of the vocal folds, which are unequal between adult males and females⁽¹⁶⁾. Furthermore, there is a higher prevalence of voice disorders in this population⁽¹⁹⁾;
Being over 18 and below 65 years of age, thus avoiding the periods of voice change and presbyphonia, respectively;
Presenting a voice complaint, answering positively to the following question: “Do you consider that you have a voice problem now or have had one during the past six months?”;
Having undergone a laryngeal visual examination and having an otorhinolaryngological report.

Of the 530 patients evaluated in the laboratory, 96 were male, 75 were under 18 or over 65 years of age and 57 had no voice complaints. Thus, 228 individuals were excluded because they did not meet the eligibility criteria, leaving a final sample of 302 patients with a mean age of 39.25 (±12.63) years. No patient had neurological or cognitive impairments that prevented voice recording.

All patients in the sample presented a laryngeal report at the time of data collection, as described below: 78 (25.85%) patients had vocal nodules, 63 (20.86%) had no structural or functional changes in the larynx, 41 (13.57%) had vocal cysts, 35 (11.60%) had hyperemia secondary to laryngopharyngeal reflux, 24 (7.94%) had a middle-posterior triangular gap, 24 (7.94%) had vocal fold polyps, 18 (5.96%) had unilateral vocal fold paralysis, 11 (3.64%) had a vocal sulcus and 8 (2.64%) had Reinke’s edema.

Procedures

All data collection for this study was conducted in the Department of Speech Therapy's Voice Laboratory (UFPB) during the initial voice evaluation session. During this session, the patients were evaluated by means of a form containing questions relating to personal information and voice complaints. They completed voice self-evaluation questionnaires and underwent the recording of speech tasks.

Only the personal identification, voice complaint and sustained vowel sample data were used for this study, as described later.

The voices were recorded in a soundproofed booth with a noise level below 50 dB SPL, at a 44000-Hz sampling rate with 16 bits per sample and a 10-cm distance between the microphone and the patient's mouth. Fonoview software, version 4.5 (CTS Informática), was used on a Dell all-in-one desktop, with a Sennheiser E-835 unidirectional cardioid microphone mounted on a pedestal and coupled to a Behringer U-Phoria UMC 204 preamplifier.

For the voice recording, the patient remained standing facing the pedestal at the recommended distance between the mouth and microphone. The patient received instructions about the voice collection, and recording began soon after. During the recording, the patient was asked to emit the sustained /ɛ/ vowel at a frequency and intensity self-reported as comfortable and normal. The /ɛ/ vowel was selected for this study because it is an oral, open, unrounded vowel and is considered to be the vowel with the most average position in Brazilian Portuguese, which facilitates a more neutral and intermediate position of the vocal tract. In addition, it is the most commonly used vowel for evaluating voice quality in Brazil⁽²⁰⁾.

Subsequently, the voices were edited using SoundForge software, version 10.0. The first and final two seconds of the vowel emission were removed due to the greater irregularity in these sections, with a minimum of three seconds being retained for each emission. The signals were normalized for the auditory-perceptual evaluation, using SoundForge's “normalize” control in peak level mode, to standardize the audio output between -6 and 6 dB.
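The editing steps above can be sketched as follows, assuming the signal is available as a plain sequence of samples scaled to the range -1 to 1. This is only an illustration and not the SoundForge procedure itself: the function names are hypothetical, and the -6 dB peak target is an assumed value chosen from within the range quoted in the text.

```python
def trim_edges(samples, sample_rate, trim_s=2.0, min_keep_s=3.0):
    """Drop the first and last `trim_s` seconds; require `min_keep_s` seconds to remain."""
    n_trim = int(round(trim_s * sample_rate))
    kept = samples[n_trim:len(samples) - n_trim]
    if len(kept) < int(round(min_keep_s * sample_rate)):
        raise ValueError("less than the minimum duration remains after trimming")
    return kept


def peak_normalize(samples, target_db=-6.0):
    """Scale the signal so its absolute peak sits at `target_db` relative to full scale.

    The -6 dB default is an assumption made for this sketch.
    """
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent signal: nothing to scale
    gain = 10 ** (target_db / 20) / peak
    return [s * gain for s in samples]
```

Peak normalization changes only the overall level, so relative measures such as jitter and shimmer are unaffected by it.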

The acoustic measures of the fundamental frequency (mean and standard deviation), jitter, shimmer and glottal-to-noise excitation (GNE) were extracted manually using the voice quality analysis module of VoxMetria software, version 4.7h (CTS Informática, Pato Branco, Paraná, Brazil). The reference values in that software for the jitter, shimmer and GNE parameters are 0.6%, 6.5% and 0.5, respectively. Jitter and shimmer values above these references are considered deviated, whereas GNE values below its reference may be considered deviated.
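The direction of each comparison is easy to confuse, so a minimal sketch may help: jitter and shimmer are flagged when they exceed their reference value, whereas GNE is flagged when it falls below its reference. The numbers are the reference values quoted above; the dictionary and function names are hypothetical and not part of VoxMetria.

```python
# Reference values as quoted in the text for jitter, shimmer and GNE.
REFERENCE = {"jitter": 0.6, "shimmer": 6.5, "gne": 0.5}


def flag_deviations(jitter, shimmer, gne):
    """Return, per measure, whether the value falls on the deviated side of its reference."""
    return {
        "jitter": jitter > REFERENCE["jitter"],     # higher is worse
        "shimmer": shimmer > REFERENCE["shimmer"],  # higher is worse
        "gne": gne < REFERENCE["gne"],              # lower is worse
    }
```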

Praat software, version 5.3.77h, was used to extract the formantic measures from the vowel’s representation in a broadband spectrogram containing the first three formants (F1, F2, and F3). Due to the large number of estimations involved, a script was used (a tool that automatically extracts, in a standardized manner, the parametric measures investigated), which facilitated the optimization of processing time and avoided possible handling errors during the estimation procedures. The means and standard deviations of the formant frequencies were extracted for each sample. All values were then checked, and no outliers were identified.

The auditory-perceptual evaluation was performed independently by three speech therapists who were voice specialists with over 10 years of experience in this type of analysis. A visual analogue scale (VAS) ranging from 0 to 100 mm was used⁽²¹⁾ to evaluate the voice deviation intensity (general grade [GG]) of the sustained vowel. A score closer to 0 represents a lower voice deviation, and one closer to 100 a greater voice deviation.

Before the auditory-perceptual evaluation, eight sustained /ɛ/ vowel anchor stimuli were used for the training of the judges. These contained two samples of individuals with normal voice quality variability (NVQV), two samples of individuals with mild to moderate voice deviations, two samples of individuals with moderate voice deviations and two samples of individuals with intense voice deviations. All the files presented contained female voices. The judges were asked to listen to the anchor stimuli immediately prior to analyzing the voices for this study. All samples selected for this training were previously analyzed by speech therapists with experience in voice analysis and were routinely used for perceptual auditory training and as anchor stimuli in the laboratory where this study was conducted.

The perceptual evaluation session took place in a silent environment. First, each judge was told that the voices should be considered as having NVQV when they were socially acceptable, produced naturally, and without effort, noise or unstable conditions during emission. They were also instructed that roughness would correspond to the presence of vibratory irregularities, breathiness would be related to the audible escape of air during the emission and tension would correspond to the perception of vocal effort during the emission.

The auditory-perceptual parameters of roughness, breathiness and tension were chosen to characterize the signals in this study because they are universally used to characterize voice quality deviation⁽²⁾ and because they have known correlates on the physiological and acoustic planes.

For the evaluation, each sustained vowel emission was presented three times through a speaker at a comfortable intensity as self-reported by the evaluators. The judges then identified the presence or absence of voice deviation, the predominant voice quality in the deviated voices (rough, breathy or tense) and, finally, made a judgment as to the voice deviation intensity.

The VAS was subsequently converted into a numerical scale, with values from 1 to 4, wherein grade 1 represented individuals with NVQV (0 to 35.5 mm), grade 2 represented subjects with mild to moderate deviation (35.6 to 50.5 mm), grade 3 represented a moderate deviation (50.6 to 90.5 mm) and grade 4 represented an intense deviation (> 90.5 mm)⁽²²⁾.
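The conversion from the VAS score to the four-point grade can be written as a small lookup using the cutoffs given above. The function name is hypothetical, and the sketch assumes scores are recorded to one decimal place, so the upper bound of each band is treated as inclusive.

```python
def vas_to_grade(vas_mm):
    """Map a 0-100 mm VAS score to grades 1-4 using the cutoffs quoted in the text."""
    if not 0 <= vas_mm <= 100:
        raise ValueError("VAS score must lie between 0 and 100 mm")
    if vas_mm <= 35.5:
        return 1  # normal voice quality variability
    if vas_mm <= 50.5:
        return 2  # mild to moderate deviation
    if vas_mm <= 90.5:
        return 3  # moderate deviation
    return 4      # intense deviation
```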

At the end of the auditory-perceptual evaluation, 10% of the samples were randomly repeated to evaluate the reliability of the judges' analysis using Cohen's kappa coefficient. The auditory-perceptual analysis results of the judge with the greatest reliability (kappa coefficient of 0.79) were selected for use in this study. The other two judges had kappa values of <0.70.
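For readers unfamiliar with the statistic, the sketch below computes an unweighted Cohen's kappa between a judge's ratings of the same samples on first and repeated presentation. Whether the study used the unweighted or a weighted form is not stated, so the unweighted form here is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa between two equally long sequences of category labels."""
    if len(first) != len(second) or not first:
        raise ValueError("both rating lists must be non-empty and of equal length")
    n = len(first)
    # Proportion of samples given the same label on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / n ** 2
    if expected == 1:
        return 1.0  # only one category was ever used, so agreement is trivially perfect
    return (observed - expected) / (1 - expected)
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than chance.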

The patients were categorized into two groups according to the auditory-perceptual analysis results as follows: 33 patients with NVQV (GG≤35.5 mm) and 269 patients with voice quality deviations (GG≥35.6 mm). Of the patients with voice quality deviations, 150 were classified as mild to moderate (35.6≤GG≤50.5 mm), 112 as moderate (50.6≤GG≤90.5 mm) and 7 as having an intense deviation (GG> 90.5 mm). Of the 269 patients with voice quality deviations, 135 (50.18%) had a predominantly rough voice quality, 95 (35.31%) had a predominantly breathy voice quality and 39 (14.49%) had a predominantly tense voice quality.

The otorhinolaryngological reports of the 33 NVQV patients showed voice complaints and a lack of structural and functional laryngeal changes. Of the 269 patients with voice quality deviations, all had voice complaints; 30 had a medical diagnosis of an absence of structural and functional laryngeal changes, and 239 were diagnosed with laryngeal changes, as described above.

This sample characterization is consistent with the literature, as there is no direct relationship between the presence of a voice complaint, the presence of voice quality deviation and the presence of laryngeal changes⁽⁵⁾. Therefore, given that the purpose of this study was not to evaluate the acoustic parameters according to the presence or absence of a speech disorder but to clarify the relationship between auditory-perceptual parameters and acoustic measures in evaluating the intensity and type of voice deviation, we decided not to exclude individuals with voice complaints but no laryngeal changes. These criteria strengthen the internal validity of the study and ensure that the independent variable (auditory-perceptual evaluation) is the only or most likely explanation for the effect on the dependent variable (acoustic parameters).

Data analysis

Descriptive statistical analyses were performed for all variables, including the mean and standard deviation values. Quadratic discriminant analysis (QDA) was performed to classify the signals as a function of the GG and predominant voice quality, with K-fold cross-validation used as an auxiliary method.
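To make the classifier concrete, here is a minimal sketch of QDA for a single acoustic measure. Each class is modelled as a Gaussian with its own mean and variance, and a new value is assigned to the class with the highest quadratic discriminant score. This is an illustration in pure Python, not the software used in the study; the function names are hypothetical, and the multivariate case replaces the variance with a per-class covariance matrix.

```python
import math


def fit_qda_1d(values, labels):
    """Estimate prior, mean and sample variance per class for one feature."""
    params = {}
    n = len(values)
    for cls in set(labels):
        xs = [v for v, lab in zip(values, labels) if lab == cls]
        mean = sum(xs) / len(xs)
        # Class-specific variance is what makes the decision boundary quadratic.
        var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
        params[cls] = (len(xs) / n, mean, var)
    return params


def predict_qda_1d(params, x):
    """Return the class with the largest discriminant score for value x."""
    def score(cls):
        prior, mean, var = params[cls]
        return math.log(prior) - 0.5 * math.log(var) - (x - mean) ** 2 / (2 * var)
    return max(params, key=score)
```

Each class needs at least two samples with non-identical values, otherwise its variance is undefined or zero.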

QDA was selected for this study because it allows identifying individual and combined variables that best discriminate between pre-established groups (GG and predominant voice quality). Eight acoustic measures were analyzed in the combined measure analysis and were combined 2 by 2, 3 by 3, 4 by 4, up to 8 by 8.
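
The size of the search over combined measures can be made concrete with a short enumeration. This is a sketch under the assumption that every subset of two to eight measures was tested; the labels in `MEASURES` are illustrative names for the eight measures listed in the Methods.

```python
from itertools import combinations

# Illustrative labels for the eight acoustic measures described in the Methods.
MEASURES = ["F0 mean", "F0 SD", "jitter", "shimmer", "GNE", "F1", "F2", "F3"]


def measure_combinations(measures, min_size=2):
    """Yield every combination of measures from min_size up to all of them."""
    for size in range(min_size, len(measures) + 1):
        yield from combinations(measures, size)


all_combos = list(measure_combinations(MEASURES))
print(len(all_combos))  # 28 + 56 + 70 + 56 + 28 + 8 + 1 = 247
```

Under this assumption, each pairwise group comparison involves 247 candidate combinations in addition to the eight isolated measures.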

In the K-fold cross-validation method, the classification was performed ten times, each time with a different partition of the data set into training and testing subsets, without repetition, so that more accurate results could be obtained⁽²²⁾. Thus, signals with different GGs and predominant voice qualities were randomly divided into subsets, with a minimum of 10 signals in each subset, as this minimum number of signals facilitates the best error estimates. Signals with intense deviation (n = 7) were excluded from the analysis because they did not satisfy the condition of having a minimum of 10 signals.

These subsets were compared by means of the cross-validation procedure, and for each iteration, performance measures (accuracy, sensitivity and specificity) were obtained for the classifier when discriminating the GG or predominant voice quality. At the end of all iterations, the mean and standard deviation of each performance measure across the subsets were computed and used to interpret the final classifier data.
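
The cross-validation loop described above can be sketched as follows. This is an illustrative outline rather than the study's implementation: `k_fold_indices` and `cross_validate` are hypothetical helper names, the fixed seed is an assumption made for reproducibility, and `evaluate` stands for any function that trains a classifier on the training indices and returns one performance value for the held-out fold.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and split them into k disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding gives folds whose sizes differ by at most one item.
    return [indices[i::k] for i in range(k)]


def cross_validate(n_items, evaluate, k=10, seed=0):
    """Run evaluate(train_idx, test_idx) once per fold; return mean and SD."""
    scores = []
    for test_idx in k_fold_indices(n_items, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(n_items) if i not in held_out]
        scores.append(evaluate(train_idx, test_idx))
    return mean(scores), stdev(scores)


# 33 NVQV + 150 mild-to-moderate signals give 183 items, so each of the
# ten folds holds 18 or 19 signals, above the minimum of 10.
folds = k_fold_indices(183, k=10)
print([len(fold) for fold in folds])
```

Reporting the mean and standard deviation across folds corresponds to the "±" values given in the Results.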

Accuracy, sensitivity and specificity measures were used to evaluate the classifier's performance. In general, the interpretation of the sensitivity and specificity measures is most evident when the groups being compared belong to a healthy (no changes) or pathologic (with changes) class⁽²³⁾. Therefore, when performing discriminant analysis between classes with changes, as in this study (when different deviation intensities and predominant voice qualities were compared), it is necessary to define which signal group will have its correct classification measured by the sensitivity and which group will have its correct classification measured by the specificity.

Therefore, a standard procedure was adopted in which the first condition presented in each table would correspond to the signal that would be classified correctly by the specificity, while the second condition would be classified correctly by the sensitivity (Chart 1).

Chart 1
Discrimination cases and their respective sensitivity and specificity measures
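
The convention adopted above (first condition scored by specificity, second condition scored by sensitivity) can be written out explicitly. The sketch below is illustrative; the function name `performance` and the example labels are hypothetical.

```python
def performance(true_labels, predicted_labels, first_condition, second_condition):
    """Compute accuracy, sensitivity and specificity for a two-group comparison.

    Correct classification of first_condition counts toward specificity;
    correct classification of second_condition counts toward sensitivity.
    """
    tn = fp = tp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == first_condition:
            if predicted == first_condition:
                tn += 1
            else:
                fp += 1
        elif truth == second_condition:
            if predicted == second_condition:
                tp += 1
            else:
                fn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels for a mild-to-moderate x moderate comparison.
truth = ["mild to moderate"] * 4 + ["moderate"] * 4
predicted = ["mild to moderate"] * 3 + ["moderate"] * 3 + ["mild to moderate"] * 2
print(performance(truth, predicted, "mild to moderate", "moderate"))
```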

The classification performance took into account signals with different GGs and different predominant voice qualities. The individual power of each of the considered acoustic measures and possible combinations of these measures were also considered, identifying those that provided the best classification rates between voice signals under the conditions established in this study.

Considering that the accuracy can be classified as excellent (>90%), good (80%-90%), acceptable (70%-80%), poor (60%-70%) or with no acceptable discrimination ability (<60%)⁽²³⁾, only classifications with a performance of over 70% were analyzed. Discriminant analysis (accuracy, sensitivity and specificity) was performed using MATLAB® software, version 7.9.
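
For reference, the accuracy bands and the 70% analysis threshold can be encoded as below. This is a small sketch: the function names are hypothetical, and because the published bands share their endpoints (for example, 80% appears in two bands), assigning a boundary value to the higher band is an assumption.

```python
def rate_accuracy(accuracy_pct: float) -> str:
    """Label an accuracy value (in %) with the bands quoted in the text.

    Shared endpoints (60, 70, 80) are assigned to the higher band (assumption).
    """
    if accuracy_pct > 90:
        return "excellent"
    if accuracy_pct >= 80:
        return "good"
    if accuracy_pct >= 70:
        return "acceptable"
    if accuracy_pct >= 60:
        return "poor"
    return "no acceptable discrimination"


def is_analyzed(accuracy_pct: float) -> bool:
    """Only classifications with accuracy over 70% were analyzed."""
    return accuracy_pct > 70


# Accuracy values reported in the Results section.
for value in (70.95, 75.24, 84.05):
    print(value, rate_accuracy(value), is_analyzed(value))
```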

RESULTS

Tables 1 and 2 show the means and standard deviations of the acoustic measures as a function of GG and predominant voice quality, respectively. These data will not be examined separately but in conjunction with the performance of the classifications used.

Table 1
Means and standard deviations of acoustic measures at different voice deviation intensities

Table 2
Means and standard deviations of acoustic measures according to predominant voice quality

First, the accuracy of the isolated acoustic measures in discriminating the GG in the patients was tested. The GNE measure had the best performance (70.95±3.05%), achieving a sensitivity of 86.67±5.44% and specificity of 55.83±5.13% (Table 3).

Table 3
Accuracy, sensitivity and specificity of the best isolated acoustic measures and best acoustic measure combinations in the discrimination of voice deviation intensity

When investigating the discriminatory power of the combined acoustic measures in the classification of GG in the investigated sample, the greatest accuracy was found in the following combinations: the means of F₀, F2 and GNE (75.24±4.86%) when distinguishing between NVQV and mild to moderate deviations; and the SD of F₀, F1, F3, jitter and GNE (74.02±3.26%) when discriminating between mild to moderate and moderate deviations (Table 3).

The accuracy of the isolated measures in the discrimination of predominant voice quality was analyzed next. GNE performed best in discriminating between NVQV and rough (73.57±5.56%), between NVQV and breathy (82.38±3.73%) and between breathy and tense (71.43±4.76%) (Table 4).

Table 4
Accuracy, sensitivity and specificity of the best isolated acoustic measures and best acoustic measure combinations in the discrimination of predominant voice quality

Finally, the performance of the combined acoustic measures in the discrimination of the voice quality was tested. The means of F₀, shimmer and GNE (78.57±4.21%) were the best combination when discriminating between NVQV and rough voice quality. The means of F3 and GNE (84.05±3.29%) were the best combination for distinguishing between NVQV and breathy voice quality. The means of F₀, F3, and GNE (73.75±3.75%) were selected as the best combination for discriminating between rough and tense voices. The combination of the means of F₀, F1 and GNE (75.71±6.41%) offered the best performance when discriminating between breathy and tense voices (Table 4).

DISCUSSION

This study investigated the accuracy of both isolated and combined traditional acoustic and formantic measures in the discrimination of GG and predominant voice quality in dysphonic patients. Two hypotheses were raised as follows: 1) the combination of traditional acoustic and formantic measures improves the discrimination of GG in voices, and 2) the combination of traditional acoustic and formantic measures improves the discrimination of different predominant voice qualities. Thus, the discussion section was organized to elucidate the conclusions reached with regard to these hypotheses.

Traditional acoustic and formantic measures in the discrimination of voice deviation intensity

When analyzing the isolated acoustic measures, only GNE showed acceptable performance (70.95±3.05%) in the discrimination between NVQV voices and voices with mild to moderate deviations, with higher sensitivity (86.67±5.44%) in the correct identification of signals with deviation.

The GNE measure appeared to be lower in patients with mild to moderate deviation than in individuals with NVQV. However, this measure did not produce values in either of the two groups that were below the 0.5% cutoff point considered for the presence of deviation in this parameter. In turn, in the comparative analysis, it could be inferred that patients with mild to moderate voice deviation had more silent airflow between the vocal folds than those with NVQV⁽⁵,¹¹⁾.

A study⁽⁴⁾ conducted with 226 participants (53 healthy controls and 173 patients with voice deviations) demonstrated that GNE showed excellent accuracy (95%) when differentiating between healthy voices and those with deviations. Thus, it may be inferred that GNE is a good voice evaluation measure because it discriminates well between healthy and deviated voices.

Based on the analysis of the combined acoustic measures, the hypothesis that a combination of traditional and formantic measures would improve the performance of the classifier in the discrimination of GG was confirmed. In addition to increasing the accuracy and specificity values, the combination of measures was able to discriminate between mild to moderate and moderate deviations, which the isolated measures could not. The combination of measures relating to the means of F₀, F2 and GNE obtained an accuracy of 75.24±4.86% when discriminating between signals with NVQV and those with mild to moderate deviations. Patients with mild to moderate deviations had lower GNE values and greater mean F₀ and F2 values than did patients with NVQV.

Lower GNE values may indicate inefficient glottal closure, more additive noise in the voice and a possible decrease in intensity⁽⁴,⁵,²⁴⁾. In turn, data in the present study in regard to GNE were analyzed comparatively between groups as no values were below the cutoff in either group of signals.

The mean F₀ values found were linked to the presence of longitudinal vocal fold tension, which causes a greater number of glottic cycles per second, resulting in a higher F₀⁽²⁵⁾.

Increased F2 values are related to tongue-fronting adjustments⁽⁶⁻⁸⁾. Such adjustments promote the elevation of the laryngeal complex, and by means of a biomechanical action, there is a greater longitudinal tension in the vocal folds, with a consequent rise in F₀, increased vocal effort and decreased voice projection⁽¹⁴,²⁵⁾.

A study⁽²⁶⁾ analyzed the formantic measures of sustained vowels and found an increase in the values of these measures when the laryngeal complex was elevated. Furthermore, F₀ values decreased when the vocal tract length increased (low larynx) and similarly increased when the vocal tract length decreased (high larynx).

It may be inferred from these findings that compared to individuals without voice quality deviation, patients with a mild to moderate degree of deviation may implement supraglottic adjustments to compensate for dysfunctional glottic conditions with the presence of increased silent airflow. These findings are consistent with other studies⁽⁸⁻¹⁰,¹³⁻¹⁵⁾ showing that dysphonic patients tend to make adjustments in the vocal tract to compensate for their voice problem.

Nonetheless, one can question whether the supraglottic adjustment may be related to the source of the voice problem in these patients as the elevation of the larynx with increased longitudinal vocal fold tension reduces the convexity of the curvature of the free edge of the vocal folds, which is one of the mechanisms responsible for increased transglottic silent airflow⁽²⁷⁾.

In general, the description and analysis of the formantic measures in the group with mild to moderate deviations seems to be interesting for understanding the supraglottic adjustments made by these patients, which may have implications for clinical evolution in voice therapy.

The measure combination of the SD of F₀, F1, F3, jitter and GNE also had an acceptable performance (74.02±3.26%) when discriminating between signals with mild to moderate deviation and those with moderate deviation. The measures of the SD of F₀, F1, F3 and jitter were higher in patients with moderate deviation, while the GNE values were lower in these patients than in individuals with mild to moderate deviation. In regard to the reference values for the GNE and jitter measures, only the latter produced values above the cutoff point for being considered deviated.

In physiological terms, the SD of F₀ is directly related to the neuromuscular condition and vocal fold mucosa vibration regularity; thus, higher F₀ SD values, as found in patients with moderate deviations, may indicate phonatory instability and greater vocal fold vibration irregularity, thereby causing deviations in voice production⁽²⁴,²⁵⁾.

Jitter evaluates perturbations in the frequency of neighboring vibration cycles⁽¹¹,¹⁸⁾, is the measure most correlated with GG⁽¹⁷⁾ and is sensitive to the presence of voice deviations. This may explain its increase in individuals with moderate voice deviations in this study.

These data suggest that patients with moderate voice deviations have more irregular vocal fold vibrations and phonatory instability (increased SD of F₀), greater silent airflow, more noise in the voice (decreased GNE) and a greater overall intensity of voice deviation (increased jitter) than do patients with mild to moderate deviations.

The increase in F1 values is related to the greater lowering of the oromandibular complex and to oropharyngeal narrowing⁽⁶⁻⁸,¹⁰,¹¹⁾. These cited supraglottic adjustments may occur as a compensation for dysfunctional glottic conditions, as a greater degree of jaw opening and pharyngeal narrowing may cause a decrease in auditorily perceived breathiness⁽²⁷⁾ and increased voice intensity⁽⁸⁻¹⁰,¹⁷⁾. An increase in F1 is also associated with the phonatory effort present in dysphonic patients with muscular tension⁽¹⁴⁾.

The hypothesis that a combination of traditional acoustic and formantic measures can improve discrimination in regard to GG was confirmed. The information seems to have a complementary nature, as formantic measures alone did not show acceptable discriminatory performance in the cases studied. Notably, in this study, an auditory-perceptual rating scale focused on the glottal source was used; therefore, one would expect a greater contribution from acoustic measures related to the glottal source.

However, more deviated voices seem to make greater supraglottic adjustments, as the higher values found in the combination of measures would be related to sensitivity, i.e., indicate the most deviated signals correctly.

Traditional acoustic and formantic measures in the discrimination of predominant voice quality

When analyzing acoustic measures alone, only GNE had an acceptable performance when discriminating between voices in terms of predominant voice quality.

In regard to the discrimination between NVQV and rough voices, an accuracy of 73.57±5.56% was found, with greater sensitivity (88.33±4.84%) for the correct identification of rough voices. Regarding the NVQV vs. breathy discrimination, an accuracy of 82.38±3.73% was found, with greater sensitivity (87.50±5.16%) in the correct identification of breathy voices. In the breathy vs. tense discrimination, an accuracy of 71.43±4.76% was found, with greater specificity (81.67±4.08%) in the correct identification of breathy voices.

Once again, in an isolated form, only the GNE measure showed acceptable values in the discrimination of the different voice qualities. In this context, GNE proved especially important in differentiating breathy voices from other voice types. This finding is probably because GNE is directly related to the source of the voice signal, i.e., whether it comes from vocal fold vibration or turbulent airflow generated in the vocal tract⁽⁴,⁵⁾. This factor could explain the direct relationship with this parameter.

The hypothesis that a combination of traditional acoustic and formantic measures can improve the discrimination of predominant voice quality was confirmed, as the combination of these measures improved the performance of the classifier when discriminating between NVQV and rough, NVQV and breathy and breathy and tense voices. It also provided acceptable discrimination between rough and tense voices.

When discriminating between NVQV and rough voices, the best combination found was the measures of the means of F₀, shimmer and GNE. This combination had an accuracy of 78.57±4.21% and greater sensitivity (87.50±5.16%) in the correct identification of rough voices. The means of F₀ and shimmer values were higher in patients with a rough voice quality, while the GNE values were reduced in relation to NVQV voices.

In general, it is expected that rough voices will have lower F₀ values⁽¹⁸⁾. However, the increase of this measure in this study may be explained by the possibility that patients with rough voices had tension associated with emission, which would raise F₀⁽²,¹⁴,²⁸⁾ compared to patients with NVQV.

Shimmer is a measure related to the variation in amplitude between adjacent cycles and is thus related to vibratory irregularity and glottic resistance⁽⁴,²⁹⁾. On the auditory-perceptual plane, previous studies have shown that shimmer is related to roughness⁽¹⁷,¹⁸⁾. The shimmer values in this study contributed to the correct identification of rough voices. It should be noted that although the shimmer values are most deviated in voices with roughness, these values are still within the normal range, given the cutoff values adopted.

The objectives of one study⁽¹⁸⁾ included an analysis of the discriminatory power of acoustic measures when classifying deviation intensity and differentiating predominant voice types. A total of 186 dysphonic patients participated in the study. The measures used were the fundamental frequency (F₀), jitter, shimmer and GNE. The results showed that the shimmer and GNE were useful in detecting rough and breathy voices, respectively.

Data from the aforementioned study⁽¹⁸⁾ appear similar to those found in the present one, as shimmer contributed to the identification of roughness in this study. Although GNE appeared in all combinations, it seemed to be more sensitive in relation to voices with a breathy quality.

The F3 and GNE measures were selected as the best combination when discriminating between NVQV and breathy voices (84.05±3.29%) and had high sensitivity (90.00±5.09%) in the correct identification of breathy voices. Patients with breathy voices had higher F3 values and lower GNE values.

The F3 frequency is related to the two cavities established by the tongue position, that is, the cavity behind the tongue constriction and the one in front of it. The F3 frequency can also be affected by adjustments to the lips, larynx and pharynx, and it has a tendency to decrease with labiodentalization adjustment and lip rounding and to increase with constriction around the pharynx⁽³,¹⁰,¹¹,²⁰⁾. Thus, one can infer that patients with a predominantly breathy voice quality have a greater constriction around the pharynx and more stretched lips, probably as compensatory mechanisms to increase voice intensity.

The findings of this study reinforce the fact that the GNE measure is strongly related to the breathy voice quality⁽⁴,⁵,¹⁸,²⁸⁾ and is the only isolated measure with acceptable accuracy when discriminating between NVQV and breathy signals.

When discriminating between rough and tense voices, the best combination found was the means of F₀, F3 and GNE (accuracy of 73.75±3.75%), and this combination had greater specificity (84.17±5.75%), that is, it was better at identifying rough voices. The mean F₀ was lower in patients with roughness than in those with tense voices, F3 had higher values in patients with roughness, and the GNE was higher in patients with a tense voice quality.

The findings suggest that patients with a tense voice quality may have greater longitudinal tension in the vocal folds, as indicated by the higher mean F₀ values. Furthermore, it appears that patients with roughness have a smaller cavity in the vocal tract, as indicated by the increase in F3⁽¹¹,¹³⁾, and that patients with a tense voice quality seem to have less noise in the voice⁽⁴,⁵⁾ than patients with roughness, an aspect suggested by the fact that the GNE is less deviated in tense voices.
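As a rough illustration of why a smaller vocal tract cavity goes together with higher formants, the sketch below uses the textbook uniform-tube approximation (a tube closed at the glottis and open at the lips), whose resonances are F_n = (2n - 1)c / 4L. This is not the analysis performed in this study: the tube lengths, the speed of sound and the function name are illustrative assumptions, and a real /ɛ/ vowel is not produced with a uniform tube.

```python
# Assumed approximate speed of sound in warm, humid air, in cm/s.
SPEED_OF_SOUND_CM_S = 35000.0


def uniform_tube_formants(length_cm, n_formants=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances (Hz) of an idealized uniform tube, closed at one end.

    Quarter-wavelength model: F_n = (2n - 1) * c / (4 * L).
    """
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, n_formants + 1)]


# Hypothetical tract lengths, chosen only to show the direction of the effect.
longer_tract = uniform_tube_formants(17.5)
shorter_tract = uniform_tube_formants(15.0)

for name, formants in (("17.5 cm", longer_tract), ("15.0 cm", shorter_tract)):
    print(name, [round(f, 1) for f in formants])
```

Under this idealization, shortening the tube raises every resonance, F3 included, which is the direction of the inference drawn above for rough voices.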

Rough and tense voices could be discriminated only when measures were combined; no isolated measure reached an acceptable value. This demonstrates the importance of finding the best combination of formantic measures to identify voice quality⁽⁴,²⁴⁾.

The measures relating to the mean F₀, F1 and GNE were selected for discrimination between breathy and tense voices, with an accuracy of 75.71±6.41% and with higher specificity (78.33±8.16%) in the correct identification of breathy voices. The F₀ and F1 values were greater in patients with tense voices, and the GNE was lower in patients with breathy voices.

Regarding the mean F₀ and tense voice quality, it is important to note that the fundamental frequency is determined, among other factors, by vocal fold tension, which is controlled by the intrinsic laryngeal muscles, specifically the cricothyroid⁽²,¹¹,¹⁵⁾. Thus, patients with vocal tension usually exhibit greater contraction of the extrinsic and intrinsic muscles, including greater longitudinal vocal fold tension, greater subglottic pressure and greater vocal tract constriction, generating a larger number of glottic cycles per second and hence a greater fundamental frequency⁽²⁵⁾.
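The link between longitudinal tension and F₀ can be illustrated with the ideal-string approximation often used as a first-order model of vocal fold vibration, F₀ = (1 / 2L)·√(σ / ρ), where L is the vibrating length, σ the longitudinal stress and ρ the tissue density. The sketch below is only a didactic aid and is not part of this study's method; the length, stress and density values and the function name are assumptions chosen for illustration.

```python
import math

# Assumed tissue density in kg/m^3 (illustrative value, not from this study).
TISSUE_DENSITY = 1040.0


def string_model_f0(length_m, stress_pa, density=TISSUE_DENSITY):
    """Fundamental frequency (Hz) of an ideal string fixed at both ends."""
    return (1.0 / (2.0 * length_m)) * math.sqrt(stress_pa / density)


# Hypothetical vibrating length (10 mm) and two hypothetical stress levels.
relaxed = string_model_f0(0.010, 15_000.0)
tenser = string_model_f0(0.010, 60_000.0)

print(f"lower stress:  {relaxed:.1f} Hz")
print(f"higher stress: {tenser:.1f} Hz")
```

In this simplified model F₀ grows with the square root of the stress, so quadrupling the stress doubles F₀ when length and density are held constant. Real vocal folds also change length and effective vibrating mass with cricothyroid contraction, which the model ignores.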

The general grade and roughness seem to be parameters more related to F₀⁽²⁸,³⁰⁾. The mean F₀ values are higher both in general grade and in vocal tension, and the F₀ standard deviation values are also high in rough voices. This study's findings seem to agree in regard to the increase in F₀ in patients with tense voices and the positive relationship between F₀ and the general grade of voice deviation.

In relation to the increased F1 values, it would seem that patients with a tense voice quality may make adjustments in the vocal tract, having a larger vertical opening of the mouth and greater pharyngeal constriction⁽⁶⁻⁸,¹⁰,¹¹⁾ than patients with a breathy voice quality.

A study⁽¹⁴⁾ conducted with 111 women with muscle tension dysphonia found similar results. The F1 and F2 formants were elevated in this population compared to those with healthy voices, suggesting adjustments in the supraglottis relating to a greater vertical opening of the mouth, greater pharyngeal constriction and a lower and more anterior tongue position. The adjustments found in that study are similar to those of the present study in regard to the greater vertical opening of the mouth and increased pharyngeal constriction as indicated by an increase in F1 in patients with a tense voice quality.

Analysis of the combined acoustic measures in the discrimination of the predominant voice quality again revealed that the GNE measure appeared in all acceptable combinations found. The F₀ measure was present in most of the combinations when discriminating predominant voice quality, which corroborates the results of previous studies⁽⁸,¹³,¹⁸,²⁹,³⁰⁾ in which the fundamental frequency proved to be a relevant measure when discriminating voice quality. This is probably because it is related, in physiological terms, to the neuromuscular condition and the regularity of vocal fold mucosa vibration and, in acoustic and perceptual terms, directly to the periodicity of the sound signal⁽⁶,⁹,¹¹,³⁰⁾.

In summary, a combination of perturbation/noise measures and formantic measures yields a slight improvement in the classification rate between voices with NVQV and those with mild to moderate deviation (75.24%) compared with the GNE measure alone (70.95%). This combination also allows discrimination between voices with mild to moderate and moderate deviations, which was not achieved with isolated measures. These findings offer evidence that the greater the voice deviation intensity, the more complex the signal in terms of aperiodicity and noise. Such intensities therefore require a combination of measures to characterize them adequately.
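The size of this gain can be broken down with the mean values reported in the results table for the NVQV × mild to moderate case. The short sketch below only subtracts the published means (it ignores the ± spread), and the variable names are illustrative.

```python
# Mean accuracy, sensitivity and specificity (%) taken from the results table
# for NVQV x mild to moderate deviation.
isolated_gne = {"acc": 70.95, "sens": 86.67, "spec": 55.83}
best_combination = {"acc": 75.24, "sens": 84.17, "spec": 67.50}  # means of F0, F2 and GNE

# Difference in percentage points, combination minus isolated measure.
gain = {
    key: round(best_combination[key] - isolated_gne[key], 2)
    for key in isolated_gne
}

for key, value in gain.items():
    print(f"{key}: {value:+.2f} percentage points")
```

Read this way, the improvement in accuracy comes from specificity, which in this discrimination case is the rate of correct classification of signals with NVQV, while sensitivity decreases slightly.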

Furthermore, a combined analysis of measures relating to the glottal source (perturbation and noise) and filter (formantic measures) contributes to a broadening of our understanding of source-filter interaction mechanisms in deviated voices and may be useful when measuring the results of treatment and monitoring during voice therapy. The fact that more formantic measures (F1 and F3) were selected by the classifier for discriminating more deviated voices shows that individuals with more intense deviations make more vocal tract adjustments, probably as a compensatory mechanism in response to the functional inefficiency of the glottal source.

In regard to the predominant voice quality, the formantic measures proved important when discriminating between NVQV and breathy (F3), rough and tense (F3) and breathy and tense (F1) voices. Specifically, the formantic measures seem to contribute most to the discrimination of the auditory-perceptual parameter tension. Individuals with tense voices probably make more supraglottic adjustments, either for compensatory reasons or in co-occurrence with the alterations at the glottic level.

The presence of a voice disorder tends to change the voice signal in different ways and may combine various types of perturbation and noise in vocal emissions as well as possible supraglottic adjustments. The combined use of measures for the evaluation, characterization and classification of the voice signal may therefore better represent voice production characteristics and highlight manifestations that would not be detected with the use of isolated measures. Other studies⁽³,²⁸,³⁰⁾ have shown that a combination of perturbation and noise measures improves the discrimination between signals with and without voice deviations. From the present study, it may further be concluded that combining measures related to vocal tract adjustments with traditional perturbation and noise measures can improve the classification of the voice deviation intensity and type and provide insights into the source-filter interaction in patients with voice deviations.

CONCLUSION

The GNE acoustic measure was the only one able to discriminate voice deviation intensity and predominant voice quality in isolation. There was a gain in the classification performance when traditional acoustic and formantic measures were combined in the discrimination of both the voice deviation intensity and predominant voice quality.

Study conducted at Departamento de Fonoaudiologia, Universidade Federal da Paraíba – UFPB - João Pessoa (PB), Brasil.
Financial support: National Council for Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq). Process nº 480168/2013-0.

REFERENCES

¹ Dejonckere PH, Bradley P, Clemente P, Cornut G, Crevier-Buchman L, Friedrich G, et al. A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating assessment techniques. Eur Arch Otorhinolaryngol. 2001;258(2):77-82. http://dx.doi.org/10.1007/s004050000299. PMid:11307610.
² Kempster GB, Gerratt BR, Verdolini Abbott K, Barkmeier-Kraemer J, Hillman RE. Consensus auditory-perceptual evaluation of voice: development of a standardized clinical protocol. Am J Speech Lang Pathol. 2009;18(2):124-32. http://dx.doi.org/10.1044/1058-0360(2008/08-0017). PMid:18930908.
³ Brockmann-Bauser M, Drinnan MJ. Routine acoustic voice analysis: time to think again? Curr Opin Otolaryngol Head Neck Surg. 2011;19(3):165-70. http://dx.doi.org/10.1097/MOO.0b013e32834575fe. PMid:21483265.
⁴ Godino-Llorente JI, Osma-Ruiz V, Sáenz-Lechón N, Gómez-Vilda P, Blanco-Velasco M, Cruz-Roldán F. Effectiveness of the glottal to noise excitation ratio for the screening of voice disorders. J Voice. 2010;24(1):47-56. http://dx.doi.org/10.1016/j.jvoice.2008.04.006. PMid:19135854.
⁵ Treole K, Trudeau MD. Changes in sustained production tasks among women with bilateral vocal nodules before and after voice therapy. J Voice. 1997;11(4):462-9. http://dx.doi.org/10.1016/S0892-1997(97)80043-4. PMid:9422281.
⁶ Fant G. Acoustic theory of speech production. 2nd ed. Paris: Mouton; 1970.
⁷ Ladefoged P. Elements of acoustic phonetics. Chicago: University of Chicago Press; 1996.
⁸ Camargo ZA. Análise da qualidade vocal de um grupo de indivíduos disfônicos: uma abordagem interpretativa e integrada de dados de natureza acústica, perceptiva e eletroglotográfica [tese]. São Paulo: Pontifícia Universidade Católica de São Paulo; 2002. 283 p.
⁹ Silva MFBL, Madureira S, Rusilo LC, Camargo Z. Vocal quality assessment: methodological approach for a perceptive data analysis. Rev CEFAC. 2017;19(6):831-41. http://dx.doi.org/10.1590/1982-021620171961417.
¹⁰ Magri A, Stamado T, Camargo ZA. Influência da largura de banda de formantes na qualidade vocal. Rev CEFAC. 2009;11(2):296-304. http://dx.doi.org/10.1590/S1516-18462009005000010.
¹¹ Lee S-H, Yu J-F, Hsieh Y-H, Lee G-S. Relationships between formant frequencies of sustained vowels and tongue contours measured by ultrasonography. Am J Speech Lang Pathol. 2015;24(4):739-49. http://dx.doi.org/10.1044/2015_AJSLP-14-0063. PMid:26254465.
¹² Titze I, Palaparthi A. Sensitivity of source-filter interaction to specific vocal tract shapes. IEEE Trans Audio Speech Lang Process. 2016;24(12):2507-15. http://dx.doi.org/10.1109/TASLP.2016.2616543.
¹³ Camargo ZA, Vilarim GS, Cukier S. Parâmetros perceptivo-auditivos e acústicos de longotermo da qualidade vocal de indivíduos disfônicos. Rev CEFAC. 2004;6(2):189-96.
¹⁴ Roy N, Nissen SL, Dromey C, Sapir S. Articulatory changes in muscle tension dysphonia: evidence of vowel space expansion following manual circumlaryngeal therapy. J Commun Disord. 2009;42(2):124-35. http://dx.doi.org/10.1016/j.jcomdis.2008.10.001. PMid:19054525.
¹⁵ Muhammad G, Mesallam TA, Malki KH, Farahat M, Alsulaiman M, Bukhari M. Formant analysis in dysphonic patients and automatic Arabic digit speech recognition. Biomed Eng Online. 2011;10:41. PMid:21624137.
¹⁶ Schwartz SR, Cohen SM, Dailey SH, Rosenfeld RM, Deutsch ES, Gillespie MB, et al. Clinical practice guideline: hoarseness (dysphonia). Otolaryngol Head Neck Surg. 2009;141(3, Suppl 2):S1-31. http://dx.doi.org/10.1016/j.otohns.2009.06.744. PMid:19729111.
¹⁷ Ma EP, Yiu EM. Multiparametric evaluation of dysphonic severity. J Voice. 2006;20(3):380-90. http://dx.doi.org/10.1016/j.jvoice.2005.04.007. PMid:16185841.
¹⁸ Lopes LW, Cavalcante DP, Costa PO. Severity of voice disorders: integration of perceptual and acoustic data in dysphonic patients. CoDAS. 2014;26(5):382-8. http://dx.doi.org/10.1590/2317-1782/20142013033. PMid:25388071.
¹⁹ Cohen SM, Pitman MJ, Noordzij JP, Courey M. Management of dysphonic patients by otolaryngologists. Otolaryngol Head Neck Surg. 2012;147(2):289-94. http://dx.doi.org/10.1177/0194599812440780. PMid:22368039.
²⁰ Gonçalves MIR, Pontes PAL, Vieira VP, Pontes AAL, Curcio D, Biase NG. Função de transferência das vogais orais do Português brasileiro: análise acústica comparativa. Rev Bras Otorrinolaringol. 2009;75(5):680-4.
²¹ Ozkan H. A comparison of classification methods for telediagnostics of Parkinson’s disease. Entropy. 2016;18(115):1-14.
²² Yamasaki R, Madazio G, Leão SHS, Padovani M, Azevedo R, Behlau M. Auditory-perceptual evaluation of normal and dysphonic voices using the voice deviation scale. J Voice. 2017;31(1):67-71. http://dx.doi.org/10.1016/j.jvoice.2016.01.004. PMid:26873420.
²³ Hosmer DW, Lemeshow S. Applied logistic regression. New York: Wiley; 2000. http://dx.doi.org/10.1002/0471722146.
²⁴ González CMT, Hernandez JBA, Orozco-Arroyave JR, Casals JS, Gallego-Jutgla E. Automatic detection of laryngeal pathologies in running speech based on the HMM transformation of the nonlinear dynamics. Lect Notes Comput Sci. 2013;1:136-43.
²⁵ Van Houtte E, Van Lierde K, Claeys S. Pathophysiology and treatment of muscle tension dysphonia: a review of the current knowledge. J Voice. 2011;25(2):202-7. http://dx.doi.org/10.1016/j.jvoice.2009.10.009. PMid:20400263.
²⁶ Macari AT, Ziade G, Turfe Z, Chidiac A, Alam E, Hamdan AL. Correlation between the position of the hyoid bone on lateral cephalographs and formant frequencies. J Voice. 2016;30(6):757.e21-6. http://dx.doi.org/10.1016/j.jvoice.2015.08.020. PMid:26604010.
²⁷ Samlan RA, Story BH, Bunton K. Relation of perceived breathiness to laryngeal kinematics and acoustic measures based on computational modeling. J Speech Lang Hear Res. 2013;56(4):1209-23. http://dx.doi.org/10.1044/1092-4388(2012/12-0194). PMid:23785184.
²⁸ Lopes LW, Costa SLNC, Costa WCA, Correia SEN, Vieira VJD. Acoustic assessment of the voices of children using nonlinear analysis: proposal for assessment and vocal monitoring. J Voice. 2014;28(5):565-73. http://dx.doi.org/10.1016/j.jvoice.2014.02.013. PMid:24836362.
²⁹ Madazio G, Leão S, Behlau M. The phonatory deviation diagram: a novel objective measurement of vocal function. Folia Phoniatr Logop. 2011;63(6):305-11. http://dx.doi.org/10.1159/000327027. PMid:21625144.
³⁰ Lopes LW, Simões LB, Silva JD, Evangelista DS, Ugulino ACN, Costa Silva PL, et al. Accuracy of acoustic analysis measurements in the evaluation of patients with different laryngeal diagnoses. J Voice. 2017;31(3):382.e15-26. http://dx.doi.org/10.1016/j.jvoice.2016.08.015. PMid:27742492.

Publication Dates

Publication in this collection: 22 Oct 2018
Date of issue: 2018

History

Received: 08 Jan 2018
Accepted: 09 Apr 2018

This is an Open Access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution and reproduction in any medium, provided the original work is properly cited.


Discrimination case	Sensitivity	Specificity
NVQV × Mild to moderate	Rate of correct classification of signals with mild to moderate deviation	Rate of correct classification of signals with NVQV
Mild × Moderate	Rate of correct classification of signals with moderate deviation	Rate of correct classification of signals with mild deviation
NVQV × Breathy	Rate of correct classification of signals with breathy voice quality	Rate of correct classification of signals with NVQV
NVQV × Rough	Rate of correct classification of signals with rough voice quality	Rate of correct classification of signals with NVQV
Breathy × Tense	Rate of correct classification of signals with tense voice quality	Rate of correct classification of signals with breathy voice quality
Rough × Breathy	Rate of correct classification of signals with breathy voice quality	Rate of correct classification of signals with rough voice quality
Rough × Tense	Rate of correct classification of signals with tense voice quality	Rate of correct classification of signals with rough voice quality
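
The conventions in the chart above fix, for each discrimination case, which class counts as "positive" (sensitivity) and which as "negative" (specificity). A minimal sketch of how the three rates follow from a set of classifications is given below; the function name and the toy labels are hypothetical and are not data from this study.

```python
def classification_rates(true_labels, predicted_labels, positive):
    """Return accuracy, sensitivity and specificity for a two-class problem.

    `positive` is the class whose correct classification rate is the
    sensitivity; the other class defines the specificity.
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy NVQV x Breathy example: breathy is the positive class, as in the chart.
truth = ["breathy"] * 4 + ["NVQV"] * 4
predicted = ["breathy", "breathy", "breathy", "NVQV"] + ["NVQV"] * 4
rates = classification_rates(truth, predicted, positive="breathy")
print(rates)
```

In this toy case three of the four breathy signals and all four NVQV signals are classified correctly, so the sensitivity is lower than the specificity even though the overall accuracy is high.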

VOICE DEVIATION INTENSITY
Measure	NVQV (Mean)	NVQV (SD)	Mild to moderate (Mean)	Mild to moderate (SD)	Moderate (Mean)	Moderate (SD)
Mean F₀	179.87	43.19	182.06	60.78	183.75	69.97
SD F₀	7.28	15.82	10.80	21.58	24.78	37.83
F1	599.35	143.10	592.95	127.39	585.63	145.74
F2	2014.08	232.43	2018.87	213.42	2033.47	231.94
F3	2812.05	216.47	2843.75	245.48	2888.86	219.74
Jitter	0.25	0.50	0.50	1.22	1.87	2.86
Shimmer	3.91	3.09	5.32	4.29	9.11	9.11
GNE	0.90	0.119	0.83	0.19	0.68	0.24

PREDOMINANT VOICE QUALITY
Measure	NVQV (Mean)	NVQV (SD)	Breathy (Mean)	Breathy (SD)	Rough (Mean)	Rough (SD)	Tense (Mean)	Tense (SD)
Mean F₀	181.02	42.62	171.62	64.52	196.75	68.82	203.83	64.81
SD F₀	7.36	15.95	18.86	33.70	13.99	22.00	19.63	35.42
F1	597.42	143.545	581.58	108.75	586.78	155.31	672.93	188.06
F2	2011.29	233.41	2005.97	229.23	2046.27	207.30	2063.18	210.52
F3	2808.08	216.124	2837.71	254.40	2911.63	202.524	2910.97	230.60
Jitter	0.25	0.509	1.44	2.46	1.02	2.75	1.42	3.27
Shimmer	3.91	3.09	8.71	7.45	6.33	5.76	8.827	11.46
GNE	0.90	0.11	0.76	0.22	0.69	0.27	0.83	0.20

Voice deviation intensity	Isolated measure	Acc (%)	Sens (%)	Spec (%)
NVQV × Mild to moderate	GNE	70.95±3.05	86.67±5.44	55.83±5.13
	Best combination
NVQV × Mild to moderate	Means of F₀, F2, and GNE	75.24±4.86	84.17±5.34	67.50±7.90
Mild to moderate × Moderate	SDs of F₀, F1, F3, Jitter, and GNE	74.02±3.26	87.62±2.51	56.14±6.28

Predominant voice quality	Isolated measure	Acc (%)	Sens (%)	Spec (%)
NVQV × Rough	GNE	73.57±5.56	88.33±4.84	59.17±11.00
NVQV × Breathy	GNE	82.38±3.73	87.50±5.16	78.33±7.88
Breathy × Tense	GNE	71.43±4.76	57.50±8.82	81.67±4.08
	Best combination
NVQV × Rough	Means of F₀, Shimmer, and GNE	78.57±4.21	87.50±5.16	70.00±6.36
NVQV × Breathy	Means of F3 and GNE	84.05±3.29	90.00±5.09	77.50±7.03
Rough × Tense	Means of F₀, F3, and GNE	73.75±3.75	60.83±6.34	84.17±5.75
Breathy × Tense	Means of F₀, F1, and GNE	75.71±6.41	71.67±7.05	78.33±8.16
