Accuracy of traditional and formant acoustic measurements in the evaluation of vocal quality

ABSTRACT

Purpose

Investigate the accuracy of isolated and combined acoustic measurements in the discrimination of voice deviation intensity (GD) and predominant voice quality (PVQ) in patients with dysphonia.

Methods

A total of 302 female patients with voice complaints participated in the study. The sustained /ɛ/ vowel was used to extract the following acoustic measures: mean and standard deviation (SD) of fundamental frequency (F0), jitter, shimmer, glottal to noise excitation (GNE) ratio and the mean of the first three formants (F1, F2, and F3). Auditory-perceptual evaluation of GD and PVQ was conducted by three speech-language pathologists who were voice specialists.

Results

In isolation, only GNE performed satisfactorily in discriminating GD and PVQ. Classification of GD and PVQ improved when the acoustic measures were combined. Mean F0, F2, and GNE (healthy × mild-to-moderate deviation), the SDs of F0, F1, and F3 (mild-to-moderate × moderate deviation), and mean jitter and GNE (moderate × intense deviation) were the best combinations for discriminating GD. The best combinations for discriminating PVQ were mean F0, shimmer, and GNE (healthy × rough), F3 and GNE (healthy × breathy), mean F0, F3, and GNE (rough × tense), and mean F0, F1, and GNE (breathy × tense).

Conclusion

In isolation, GNE proved to be the only acoustic parameter capable of discriminating GD and PVQ. Classification performance for both GD and PVQ improved when traditional and formant acoustic measurements were combined.

Keywords
Voice; Accuracy; Acoustic Analysis; Vocal Quality; Voice Disorders

INTRODUCTION

The voice is essentially a multidimensional phenomenon that includes physiological, perceptual, aerodynamic, acoustic and emotional aspects. Voice evaluation should therefore follow the same principle, considering and integrating these dimensions to achieve an overall view of dysphonia (1).

The goal of voice evaluation is to analyze voice quality, identify whether the voice is healthy, diagnose the presence of a disturbance, determine a prognosis, and monitor the patient's progress during voice therapy (2). The process generally includes a visual laryngeal examination, auditory-perceptual voice evaluation, acoustic analysis, aerodynamic evaluation and voice self-evaluation (1).

Auditory-perceptual analysis is considered the primary reference standard used by the speech therapist when performing voice evaluations (2). It is considered a subjective method, as it depends on the evaluator's judgment and has an exclusively impressionistic nature (2,3). This type of evaluation characterizes the voice deviation intensity as well as the predominant voice quality (4).

Acoustic analysis is a more objective procedure. It is noninvasive and is increasingly used in the voice clinic. Traditional acoustic analysis uses two types of measures: perturbation measures (jitter and shimmer) and noise measures. Jitter indicates the short-term variability of the fundamental frequency, measured between neighboring glottal cycles. Shimmer corresponds to the short-term variability of the sound wave amplitude. Glottal-to-noise excitation (GNE) measures the additional noise in the sound signal, irrespective of the noise modulated by the glottal mechanism, and indicates whether the voice signal originates from vocal fold vibration or from turbulent airflow generated in the vocal tract. Perturbation and noise measures are therefore focused on the glottal source (3-5).
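As a rough illustration of the perturbation measures described above, the sketch below computes a relative cycle-to-cycle perturbation: the mean absolute difference between neighboring cycles divided by the mean value, in percent. Applied to glottal periods it gives a local jitter, and applied to peak amplitudes a local shimmer. This follows a common textbook definition and is an assumption here; the software used in this study may compute these measures differently, and the function name and sample values are hypothetical.

```python
def relative_perturbation(cycle_values):
    """Mean absolute difference between neighboring cycles, as a % of the mean value."""
    if len(cycle_values) < 2:
        raise ValueError("need at least two glottal cycles")
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    return 100.0 * mean_diff / mean_value


# Hypothetical glottal periods (seconds) and peak amplitudes, for illustration only.
periods = [0.0050, 0.0051, 0.0049, 0.0050]
amplitudes = [0.80, 0.78, 0.82, 0.79]
jitter_percent = relative_perturbation(periods)
shimmer_percent = relative_perturbation(amplitudes)
```

A perfectly periodic signal yields 0% for both measures; larger values indicate more cycle-to-cycle irregularity.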

In addition to these measures, some measures are related to the resonance of the sound wave in the vocal tract, which changes with the positioning of the vocal tract structures and the volume of the resonance cavities during voice production. These measures are called formants and correspond to energy concentrations along the vocal tract (3-6).

The vocal tract has a three-dimensional configuration, and the sound produced in the glottis is modified by the positioning of structures such as the larynx, soft palate, tongue, lips and jaw. The frequencies of the glottal signal that are reinforced by the supraglottic vocal tract are called formants, and their analysis provides information about adjustments being made in the supraglottic vocal tract (6-10).

Adjustments in the positioning of the articulators and in the volume of the resonance cavities determine the formant values (6-8,11). Thus, an increase in the first formant (F1), for example, is related to a downward jaw adjustment, anterior lowering of the tongue and pharyngeal narrowing. An anterior adjustment of the tongue, which is then lowered, generates an increase in the second formant (F2). The formation of a smaller cavity immediately behind the incisors can raise the value of the third formant (F3) (6-8,10,11).

In this context, there is a strong interaction between the sound source (glottis) and the filter. The pressure feedback encountered by the sound wave in the vocal tract modifies the glottal airflow and the vocal fold vibration mode (12).

Some studies (8-10,13-15) have observed that patients with a voice disorder make adjustments not just in the glottis but also in the supraglottis. These patients have lower formant values (F1, F2, F3) than individuals without a voice disorder (10,13,15).

Thus, these adjustments may be related to the development or maintenance of voice disorders, or may co-occur with them (11,13). Such adjustments are not necessarily captured by traditional acoustic measures, which focus on the glottal source (16).

Notably, acoustic analysis does not replace auditory-perceptual analysis but rather integrates the auditory and physiological levels (6-8). A combination of acoustic and auditory-perceptual measures increases the accuracy in determining the presence or absence of a voice disorder and the intensity of the deviation present (17,18).

For this reason, it is important to investigate whether a combination of measures relating to the source (perturbation and noise) and the filter (formant measures) allows a better classification of voice signals with regard to deviation intensity and predominant voice quality.

This study therefore aims to investigate the accuracy of traditional acoustic and formant measures, both isolated and combined, in discriminating voice deviation intensity and predominant voice quality in dysphonic patients. We hypothesized that combining traditional acoustic and formant measures improves the discrimination of both voice deviation intensity and predominant voice quality.

METHODS

Study design

This was a descriptive, cross-sectional, observational study, evaluated and approved by the Ethics Committee of the Health Sciences Center, Federal University of Paraíba (UFPB), under protocol number 52492/12. All participants signed a free and informed consent form authorizing the study.

Sample

Patients treated at the Department of Speech Therapy's Voice Laboratory (UFPB) in the period between April 2012 and July 2015 participated in this study. The following eligibility criteria were considered for participation:

  • Being female, given the relationship between sex and the mean F0 measure, which is associated with the anatomical characteristics of the vocal folds, which differ between adult males and females (16). Furthermore, there is a higher prevalence of voice disorders in this population (19);

  • Being over 18 and below 65 years of age, thus avoiding the periods of voice change and presbyphonia, respectively;

  • Presenting a voice complaint, answering positively to the following question: “Do you consider that you have a voice problem now or have had one during the past six months?”;

  • Having undergone a laryngeal visual examination and having an otorhinolaryngological report.

Of the 530 patients evaluated in the laboratory, 96 were male, 75 were under 18 or over 65 years of age and 57 had no voice complaints. Thus, 228 individuals were excluded because they did not meet the eligibility criteria, leaving a final sample of 302 patients with a mean age of 39.25 (±12.63) years. No patient had neurological or cognitive impairments that prevented voice recording.

All patients in the sample presented a laryngeal report at the time of data collection, as follows: 78 (25.85%) patients had vocal nodules, 63 (20.86%) had no structural or functional changes in the larynx, 41 (13.57%) had vocal cysts, 35 (11.60%) had hyperemia secondary to laryngopharyngeal reflux, 24 (7.94%) had a middle-posterior triangular gap, 24 (7.94%) had vocal fold polyps, 18 (5.96%) had unilateral vocal fold paralysis, 11 (3.64%) had a vocal sulcus and 8 (2.64%) had Reinke’s edema.

Procedures

All data collection for this study was conducted in the Department of Speech Therapy's Voice Laboratory (UFPB) during the initial voice evaluation session. During this session, the patients were evaluated by means of a form containing questions relating to personal information and voice complaints. They completed voice self-evaluation questionnaires and underwent the recording of speech tasks.

Only the personal identification, voice complaint and sustained vowel sample data were used for this study, as described later.

The voices were recorded in a soundproofed booth with a noise level below 50 dB SPL, at a 44000-Hz sampling rate with 16 bits per sample and a 10-cm distance between the microphone and the patient's mouth. Fonoview software, version 4.5 (CTS Informática), was used on a Dell all-in-one desktop, with a Sennheiser E-835 unidirectional cardioid microphone mounted on a pedestal and coupled to a Behringer U-Phoria UMC 204 preamplifier.

For the voice recording, the patient remained standing, facing the pedestal at the recommended distance between the mouth and microphone. The patient received instructions about the voice collection, and recording began soon after. During the recording, the patient was asked to produce the sustained /ɛ/ vowel at a self-selected comfortable, habitual pitch and loudness. The /ɛ/ vowel was selected for this study because it is an oral, open, unrounded vowel and is considered the most medial vowel in Brazilian Portuguese, which favors a more neutral, intermediate position of the vocal tract. In addition, it is the most commonly used vowel for evaluating voice quality in Brazil (20).

Subsequently, the voices were edited using SoundForge software, version 10.0. The first and final two seconds of the vowel emission were removed due to the greater irregularity in these sections, with a minimum time of three seconds being retained for each emission. The signals were normalized for the auditory-perceptual evaluation, using SoundForge's “normalize” control in peak level mode, to standardize the audio output at between -6 and 6 dB.

The acoustic measures of the fundamental frequency (mean and standard deviation), jitter, shimmer and glottal-to-noise excitation (GNE) were extracted manually using the voice quality analysis module of VoxMetria software, version 4.7h (CTS Informática, Pato Branco, Paraná, Brazil). The reference values in that software for jitter, shimmer and GNE are 0.6%, 6.5% and 0.5, respectively. Jitter and shimmer values above these references are considered deviated, while GNE values below its reference may be considered deviated.
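The reference values quoted above can be read as a simple screening rule. The sketch below is illustrative only: the cut-offs (0.6 for jitter, 6.5 for shimmer, 0.5 for GNE) come from the text, while the function name, dictionary keys and sample measurements are hypothetical.

```python
# Reference values quoted in the text for VoxMetria.
JITTER_MAX = 0.6    # values above this are considered deviated
SHIMMER_MAX = 6.5   # values above this are considered deviated
GNE_MIN = 0.5       # values below this may be considered deviated


def flag_deviated(jitter, shimmer, gne):
    """Return which measures fall outside the quoted reference values."""
    return {
        "jitter": jitter > JITTER_MAX,
        "shimmer": shimmer > SHIMMER_MAX,
        "gne": gne < GNE_MIN,
    }


# Hypothetical measurements for a single voice sample.
flags = flag_deviated(jitter=0.9, shimmer=4.2, gne=0.35)
```

Values exactly at a reference are treated as not deviated in this sketch, which is an assumption about how the boundary is handled.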

Praat software, version 5.3.77h, was used to extract the formant measures from the vowel’s representation in a broadband spectrogram containing the first three formants (F1, F2, and F3). Because of the large number of estimations involved, a script was used to extract the investigated measures automatically and in a standardized manner, which reduced processing time and avoided possible handling errors during estimation. The means and standard deviations of the formant frequencies were extracted for each sample. All values were then checked, and no outliers were identified.

The auditory-perceptual evaluation was performed independently by three speech therapists who were voice specialists with over 10 years of experience in this type of analysis. A visual analogue scale (VAS) ranging from 0 to 100 mm was used (21) to evaluate the voice deviation intensity (general grade [GG]) of the sustained vowel. A score closer to 0 represents a lower voice deviation, and one closer to 100 a greater voice deviation.

Before the auditory-perceptual evaluation, eight sustained /ɛ/ vowel anchor stimuli were used to train the judges. These comprised two samples of individuals with normal voice quality variability (NVQV), two samples of individuals with mild to moderate voice deviations, two samples of individuals with moderate voice deviations and two samples of individuals with intense voice deviations. All the files presented contained female voices. The judges were asked to listen to the anchor stimuli immediately before analyzing the voices for this study. All samples selected for this training had previously been analyzed by speech therapists with experience in voice analysis and were routinely used for auditory-perceptual training and as anchor stimuli in the laboratory where this study was conducted.

The perceptual evaluation session took place in a silent environment. First, each judge was told that the voices should be considered as having NVQV when they were socially acceptable, produced naturally, and without effort, noise or unstable conditions during emission. They were also instructed that roughness would correspond to the presence of vibratory irregularities, breathiness would be related to the audible escape of air during the emission and tension would correspond to the perception of vocal effort during the emission.

The auditory-perceptual parameters of roughness, breathiness and tension were chosen to characterize the signals in this study because they are universally used to characterize voice quality deviation (2) and because they have known correlates on the physiological and acoustic planes.

For the evaluation, each sustained vowel emission was presented three times through a speaker at a comfortable intensity as self-reported by the evaluators. The judges then identified the presence or absence of voice deviation, the predominant voice quality in the deviated voices (rough, breathy or tense) and, finally, made a judgment as to the voice deviation intensity.

The VAS scores were subsequently converted into a numerical scale from 1 to 4, in which grade 1 represented individuals with NVQV (0 to 35.5 mm), grade 2 a mild to moderate deviation (35.6 to 50.5 mm), grade 3 a moderate deviation (50.6 to 90.5 mm) and grade 4 an intense deviation (> 90.5 mm) (22).
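The VAS-to-grade conversion described above amounts to a threshold lookup. A minimal sketch follows, using the cut-off values given in the text; the function name is hypothetical, and it assumes scores are recorded to one decimal place, so that no value falls between, for example, 35.5 and 35.6 mm.

```python
def vas_to_grade(vas_mm):
    """Map a 0-100 mm VAS score to the 1-4 grade using the cut-offs in the text."""
    if not 0 <= vas_mm <= 100:
        raise ValueError("VAS score must be between 0 and 100 mm")
    if vas_mm <= 35.5:
        return 1  # normal voice quality variability (NVQV)
    if vas_mm <= 50.5:
        return 2  # mild to moderate deviation
    if vas_mm <= 90.5:
        return 3  # moderate deviation
    return 4      # intense deviation
```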

At the end of the auditory-perceptual evaluation, 10% of the samples were randomly repeated to evaluate the reliability of the judges' analysis using Cohen's kappa coefficient. The auditory-perceptual analysis results of the judge with the greatest reliability (kappa coefficient of 0.79) were selected for use in this study. The other two judges had kappa values <0.70.
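For readers unfamiliar with Cohen's kappa, the sketch below computes it for two sequences of categorical ratings, here imagined as one judge's grades for the same samples on first and repeated listening. It is a generic, unweighted implementation; the text does not state which variant or software was used, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohen_kappa(first_ratings, second_ratings):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(first_ratings) != len(second_ratings) or not first_ratings:
        raise ValueError("rating sequences must be non-empty and of equal length")
    n = len(first_ratings)
    # Observed agreement: proportion of items rated identically.
    observed = sum(a == b for a, b in zip(first_ratings, second_ratings)) / n
    # Chance agreement: product of marginal proportions, summed over categories.
    first_counts = Counter(first_ratings)
    second_counts = Counter(second_ratings)
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for ten repeated samples, for illustration only.
first_pass = [1, 2, 2, 3, 3, 2, 1, 3, 2, 2]
second_pass = [1, 2, 3, 3, 3, 2, 1, 2, 2, 2]
kappa = cohen_kappa(first_pass, second_pass)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the simple proportion of matching ratings.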

The patients were categorized into two groups according to the auditory-perceptual analysis results as follows: 33 patients with NVQV (GG ≤ 35.5 mm) and 269 patients with voice quality deviations (GG ≥ 35.6 mm). Of the patients with voice quality deviations, 150 were classified as mild to moderate (35.6 ≤ GG ≤ 50.5 mm), 112 as moderate (50.6 ≤ GG ≤ 90.5 mm) and 7 as having an intense deviation (GG > 90.5 mm). Of the 269 patients with voice quality deviations, 135 (50.18%) had a predominantly rough voice quality, 95 (35.31%) had a predominantly breathy voice quality and 39 (14.49%) had a predominantly tense voice quality.

The otorhinolaryngological reports of the 33 NVQV patients showed voice complaints and a lack of structural and functional laryngeal changes. Of the 269 patients with voice quality deviations, all had voice complaints; 30 had a medical diagnosis of an absence of structural and functional laryngeal changes, and 239 were diagnosed with laryngeal changes, as described above.

This sample characterization is consistent with the literature, as there is no direct relationship between the presence of a voice complaint, the presence of voice quality deviation and the presence of laryngeal changes (5). Therefore, given that the purpose of this study was not to evaluate the acoustic parameters according to the presence or absence of a voice disorder but to clarify the relationship between auditory-perceptual parameters and acoustic measures in evaluating the intensity and type of voice deviation, we decided not to exclude individuals with voice complaints but no laryngeal changes. These criteria strengthen the internal validity of the study and ensure that the independent variable (auditory-perceptual evaluation) is the only or most likely explanation for the effect on the dependent variable (acoustic parameters).

Data analysis

Descriptive statistical analyses were performed for all variables, including the mean and standard deviation values. Quadratic discriminant analysis (QDA) was performed to classify the signals as a function of the GG and predominant voice quality, with K-fold cross-validation used as an auxiliary method.
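To make the idea behind QDA concrete, the sketch below implements its simplest case: a single acoustic measure, with each class modeled as a Gaussian with its own mean and variance. The study does not state which software or implementation was used, so this is only an illustration; the function names and the sample GNE values are hypothetical.

```python
import math
from statistics import mean, variance


def fit_qda_1d(samples_by_class):
    """Estimate mean, variance and prior for each class from one acoustic measure."""
    total = sum(len(values) for values in samples_by_class.values())
    return {
        label: (mean(values), variance(values), len(values) / total)
        for label, values in samples_by_class.items()
    }


def qda_score(x, class_mean, class_variance, prior):
    """Quadratic discriminant score: log Gaussian density plus log prior (constants dropped)."""
    return (-0.5 * math.log(class_variance)
            - (x - class_mean) ** 2 / (2.0 * class_variance)
            + math.log(prior))


def predict(x, model):
    """Assign x to the class with the highest discriminant score."""
    return max(model, key=lambda label: qda_score(x, *model[label]))


# Hypothetical GNE values for two groups, for illustration only.
model = fit_qda_1d({
    "healthy": [0.90, 0.85, 0.95, 0.88],
    "deviated": [0.40, 0.55, 0.30, 0.62],
})
```

Because each class keeps its own variance, the decision boundary is quadratic in the measure rather than linear. With several measures, the variances become per-class covariance matrices.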

QDA was selected for this study because it makes it possible to identify the individual and combined variables that best discriminate between pre-established groups (GG and predominant voice quality). Eight acoustic measures were included in the combined analysis and were combined 2 by 2, 3 by 3, 4 by 4, and so on up to 8 by 8.
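To give a sense of the size of this search, the sketch below enumerates every subset of two or more of the eight measures. It is an illustration in Python, not the MATLAB procedure used in the study, and the label strings and the helper `measure_subsets` are hypothetical names.

```python
from itertools import combinations

# The eight acoustic measures named in the Methods (labels are illustrative).
MEASURES = ["F0 mean", "F0 SD", "jitter", "shimmer", "GNE", "F1", "F2", "F3"]


def measure_subsets(measures, min_size=2):
    """Yield every combination of min_size or more measures."""
    for size in range(min_size, len(measures) + 1):
        yield from combinations(measures, size)


subsets = list(measure_subsets(MEASURES))
n_subsets = len(subsets)  # 28 + 56 + 70 + 56 + 28 + 8 + 1 = 247
```

Each of the 247 candidate combinations would then be passed to the classifier and compared on its cross-validated performance.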

In the K-fold cross-validation method, the classification was performed ten times, varying the data set used for training and testing without repetition, so that more accurate results could be obtained(22). Thus, signals with different GGs and predominant voice qualities were randomly divided into subsets, with a minimum of 10 signals in each subset, as this minimum number of signals facilitates the best error estimates. Signals with intense deviations were excluded from the analysis because this group (7 signals) did not satisfy the condition of having a minimum of 10 signals.

These subsets were compared by means of the cross-validation procedure, and for each iteration between subsets, performance measures (accuracy, sensitivity and specificity) were obtained for the classifier when discriminating the GG or predominant voice quality. At the end of all subset iterations, the mean and standard deviation values of the formed subsets were extracted and used to interpret the final classifier data.
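The procedure described above can be sketched as follows. This is a minimal Python illustration under several assumptions, not the MATLAB analysis used in the study: it uses a single feature (for which QDA reduces to one Gaussian with its own variance per class), synthetic values, and folds that are stratified by class, which the text does not state. All function names are hypothetical.

```python
import math
import random
import statistics


def quadratic_score(x, mean, var, prior):
    """One-feature quadratic discriminant score for a class."""
    return math.log(prior) - 0.5 * math.log(var) - (x - mean) ** 2 / (2 * var)


def stratified_folds(labels, k, rng):
    """Deal the shuffled indices of each class into k folds (an assumption)."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


def cross_validate(values, labels, k=10, seed=0):
    """Return {metric: (mean, SD)} over k folds; label 1 is the second condition."""
    folds = stratified_folds(labels, k, random.Random(seed))
    per_fold = {"accuracy": [], "sensitivity": [], "specificity": []}
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(values)) if i not in held_out]
        model = {}
        for cls in (0, 1):
            xs = [values[i] for i in train_idx if labels[i] == cls]
            prior = len(xs) / len(train_idx)
            model[cls] = (statistics.fmean(xs), statistics.pvariance(xs), prior)
        tp = tn = fp = fn = 0
        for i in test_idx:
            pred = max((0, 1), key=lambda c: quadratic_score(values[i], *model[c]))
            if labels[i] == 1:
                tp += pred == 1
                fn += pred == 0
            else:
                tn += pred == 0
                fp += pred == 1
        per_fold["accuracy"].append((tp + tn) / len(test_idx))
        per_fold["sensitivity"].append(tp / (tp + fn))
        per_fold["specificity"].append(tn / (tn + fp))
    return {m: (statistics.fmean(v), statistics.stdev(v)) for m, v in per_fold.items()}


# Synthetic, GNE-like values (illustrative only, not study data).
rng = random.Random(1)
values = [rng.gauss(0.90, 0.05) for _ in range(50)] + [rng.gauss(0.60, 0.10) for _ in range(50)]
labels = [0] * 50 + [1] * 50
summary = cross_validate(values, labels)
```

The returned dictionary holds the mean and standard deviation of each performance measure across the ten folds, which is the form in which the results tables report them.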

Accuracy, sensitivity and specificity measures were used to evaluate the classifier's performance. In general, the interpretation of the sensitivity and specificity measures is most evident when the groups being compared belong to a healthy (no changes) or pathologic (with changes) class(23). Therefore, when performing discriminant analysis between classes with changes, as in this study (when different deviation intensities and predominant voice qualities were compared), it is necessary to specify which signal group will have its correct classification measured by the sensitivity and which group will have its correct classification measured by the specificity.

Accordingly, a standard procedure was adopted in which the first condition presented in each table corresponds to the signal whose correct classification is measured by the specificity, while the correct classification of the second condition is measured by the sensitivity (Chart 1).

Chart 1
Discrimination cases and their respective sensitivity and specificity measures
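The convention described above can be illustrated with a short Python sketch. The function `performance` and the example labels are hypothetical and are not taken from the study data; the first condition plays the role of the negative class and the second that of the positive class.

```python
def performance(true_labels, predicted_labels, first, second):
    """Accuracy, sensitivity and specificity for a pairwise discrimination.

    Correct classification of `first` counts toward specificity;
    correct classification of `second` counts toward sensitivity.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tn = sum(t == first and p == first for t, p in pairs)
    fp = sum(t == first and p == second for t, p in pairs)
    tp = sum(t == second and p == second for t, p in pairs)
    fn = sum(t == second and p == first for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Made-up example for a rough x tense discrimination.
truth = ["rough"] * 4 + ["tense"] * 4
predicted = ["rough", "rough", "rough", "tense", "tense", "tense", "rough", "rough"]
scores = performance(truth, predicted, first="rough", second="tense")
# Three of four rough signals are correct (specificity 0.75);
# two of four tense signals are correct (sensitivity 0.50).
```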

The classification performance took into account signals with different GGs and different predominant voice qualities. The individual power of each of the considered acoustic measures and possible combinations of these measures were also considered, identifying those that provided the best classification rates between voice signals under the conditions established in this study.

Considering that the accuracy can be classified as excellent (>90%), good (80%-90%), acceptable (70%-80%), poor (60%-70%) or with no acceptable discrimination ability (<60%)(23), only classifications with a performance of over 70% were analyzed. Discriminant analysis (accuracy, sensitivity and specificity) was performed using MATLAB® software, version 7.9.
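The accuracy bands above can be written as a simple grading function. This is a Python sketch with hypothetical names; because the quoted ranges share their end points, assigning a boundary value such as exactly 80% to the higher band is an assumption.

```python
def grade_accuracy(accuracy_pct: float) -> str:
    """Return the qualitative band for an accuracy given in percent."""
    if accuracy_pct > 90:
        return "excellent"
    if accuracy_pct >= 80:
        return "good"
    if accuracy_pct >= 70:
        return "acceptable"
    if accuracy_pct >= 60:
        return "poor"
    return "no acceptable discrimination"


def is_analyzed(accuracy_pct: float) -> bool:
    """Only classifications with a performance of over 70% were analyzed."""
    return accuracy_pct > 70


# Accuracies reported in the Results for GNE in isolation.
grades = {acc: grade_accuracy(acc) for acc in (70.95, 73.57, 82.38)}
```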

RESULTS

Tables 1 and 2 show the means and standard deviations of the acoustic measures as a function of GG and predominant voice quality, respectively. These data will not be examined separately but in conjunction with the performance of the classifications used.

Table 1
Means and standard deviations of acoustic measures at different voice deviation intensities
Table 2
Means and standard deviations of acoustic measures according to predominant voice quality

First, the accuracy of the isolated acoustic measures in discriminating the GG in the patients was tested. The GNE measure had the best performance (70.95±3.05%), achieving a sensitivity of 86.67±5.44% and specificity of 55.83±5.13% (Table 3).

Table 3
Accuracy, sensitivity and specificity of the best isolated acoustic measures and best acoustic measure combinations in the discrimination of voice deviation intensity

When investigating the discriminatory power of the combined acoustic measures in the classification of GG in the investigated sample, the greatest accuracy was found in the following combinations: the means of F0, F2 and GNE (75.24±4.86%) when distinguishing between NVQV and mild to moderate deviations; and the SDs of F0, F1, F3, jitter and GNE (74.02±3.26%) when discriminating between mild to moderate and moderate deviations (Table 3).

The accuracy of the isolated measures in the discrimination of predominant voice quality was analyzed next. GNE performed best in discriminating between NVQV and rough (73.57±5.56%), between NVQV and breathy (82.38±3.73%) and between breathy and tense (71.43±4.76%) (Table 4).

Table 4
Accuracy, sensitivity and specificity of the best isolated acoustic measures and best acoustic measure combinations in the discrimination of predominant voice quality

Finally, the performance of the combined acoustic measures in the discrimination of the voice quality was tested. The means of F0, shimmer and GNE (78.57±4.21%) were the best combination when discriminating between NVQV and rough voice quality. The means of F3 and GNE (84.05±3.29%) were the best combination for distinguishing between NVQV and breathy voice quality. The means of F0, F3, and GNE (73.75±3.75%) were selected as the best combination for discriminating between rough and tense voices. The combination of the means of F0, F1 and GNE (75.71±6.41%) offered the best performance when discriminating between breathy and tense voices (Table 4).

DISCUSSION

This study investigated the accuracy of both isolated and combined traditional acoustic and formantic measures in the discrimination of GG and predominant voice quality in dysphonic patients. Two hypotheses were raised as follows: 1) the combination of traditional acoustic and formantic measures improves the discrimination of GG in voices, and 2) the combination of traditional acoustic and formantic measures improves the discrimination of different predominant voice qualities. Thus, the discussion section was organized to elucidate the conclusions reached with regard to these hypotheses.

Traditional acoustic and formantic measures in the discrimination of voice deviation intensity

When analyzing the isolated acoustic measures, only GNE showed acceptable performance (70.95±3.05%) in the discrimination between NVQV voices and voices with mild to moderate deviations, with higher sensitivity (86.67±5.44%) in the correct identification of signals with deviation.

GNE values were lower in patients with mild to moderate deviation than in individuals with NVQV. However, in neither group did this measure fall below the 0.5% cut-off point adopted to indicate deviation in this parameter. Nevertheless, the comparative analysis suggests that patients with mild to moderate voice deviation had more silent airflow between the vocal folds than those with NVQV(5,11).

A study(4) conducted with 226 participants (53 healthy controls and 173 patients with voice deviations) demonstrated that GNE showed excellent accuracy (95%) when differentiating between healthy voices and those with deviations. Thus, it may be inferred that GNE is a good voice evaluation measure because it shows greater discrimination between healthy and deviated voices.

Based on the analysis of the combined acoustic measures, the hypothesis that a combination of traditional and formantic measures would improve the performance of the classifier in the discrimination of GG was confirmed. In addition to increasing the accuracy and specificity values, the combination of measures was able to discriminate between mild to moderate and moderate deviations, which the isolated measures could not. The combination of measures relating to the means of F0, F2 and GNE obtained an accuracy of 75.24±4.86% when discriminating between signals with NVQV and those with mild to moderate deviations. Patients with mild to moderate deviations had lower GNE values and greater mean F0 and F2 values than did patients with NVQV.

Lower GNE values may indicate inefficient glottal closure, more additive noise in the voice and a possible decrease in intensity(4,5,24). In the present study, however, GNE was analyzed comparatively between groups because no values fell below the cutoff in either group of signals.

The mean F0 values found were linked to the presence of longitudinal vocal fold tension, which causes a greater number of glottic cycles per second, resulting in a greater F0 elevation(25).

Increased F2 values are related to tongue anteriorization adjustments(6-8). Such adjustments promote the elevation of the laryngeal complex, and by means of a biomechanical action, there is a greater longitudinal tension in the vocal folds, with a consequent rise in F0, increased vocal effort and decreased voice projection(14,25).

A study(26) analyzed the formantic measures of sustained vowels and found an increase in the values of these measures when the laryngeal complex was elevated. Furthermore, F0 values decreased when the vocal tract length increased (low larynx) and similarly increased when the vocal tract length decreased (high larynx).

It may be inferred from these findings that compared to individuals without voice quality deviation, patients with a mild to moderate degree of deviation may implement supraglottic adjustments to compensate for dysfunctional glottic conditions with the presence of increased silent airflow. These findings are consistent with other studies(8-10,13-15) showing that dysphonic patients tend to make adjustments in the vocal tract to compensate for their voice problem.

Nonetheless, one can question whether the supraglottic adjustment may be related to the source of the voice problem in these patients, as the elevation of the larynx with increased longitudinal vocal fold tension reduces the convexity of the curvature of the free edge of the vocal folds, which is one of the mechanisms responsible for increased transglottic silent airflow(27).

In general, the description and analysis of the formantic measures in the group with mild to moderate deviations seems to be interesting for understanding the supraglottic adjustments made by these patients, which may have implications for clinical evolution in voice therapy.

The measure combination of the SD of F0, F1, F3, jitter and GNE also had an acceptable performance (74.02±3.26%) when discriminating between signals with mild to moderate deviation and those with moderate deviation. The measures of the SD of F0, F1, F3 and jitter were higher in patients with moderate deviation, while the GNE values were lower in these patients than in individuals with mild to moderate deviation. In regard to the reference values for the GNE and jitter measures, only the latter produced values above the cutoff point for being considered deviated.

In physiological terms, the SD of F0 is directly related to the neuromuscular condition and vocal fold mucosa vibration regularity; thus, higher F0 SD values, as found in patients with moderate deviations, may indicate phonatory instability and greater vocal fold vibration irregularity, thereby causing deviations in voice production(24,25).

Jitter evaluates perturbations in the frequency of neighboring vibration cycles(11,18); it is the measure most correlated with GG(17) and is sensitive to the presence of voice deviations. This explains its increase in individuals with moderate voice deviations in this study.

These data suggest that patients with moderate voice deviations have more irregular vocal fold vibrations and phonatory instability (increased SD of F0), greater silent airflow, more noise in the voice (decreased GNE) and a greater overall intensity of voice deviation (increased jitter) than do patients with mild to moderate deviations.

The increase in F1 values is related to the greater lowering of the oromandibular complex and to oropharyngeal narrowing(6-8,10,11). These supraglottic adjustments may occur as a compensation for dysfunctional glottic conditions, as a greater degree of jaw opening and pharyngeal narrowing may cause a decrease in auditorily perceived breathiness(27) and increased voice intensity(8-10,17). An increase in F1 is also associated with the phonatory effort present in dysphonic patients with muscular tension(14).

The hypothesis that a combination of traditional acoustic and formantic measures can improve discrimination in regard to GG was confirmed. The information seems to have a complementary nature, as formantic measures alone did not show acceptable discriminatory performance in the cases studied. Notably, in this study, an auditory-perceptual rating scale focused on the glottal source was used; therefore, one would expect a greater contribution from acoustic measures related to the glottal source.

However, more deviated voices seem to make greater supraglottic adjustments, as the higher values found in the combination of measures would be related to sensitivity, i.e., indicate the most deviated signals correctly.

Traditional acoustic and formantic measures in the discrimination of predominant voice quality

When analyzing acoustic measures alone, only GNE had an acceptable performance when discriminating between voices in terms of predominant voice quality.

In regard to the discrimination between NVQV and rough voices, an accuracy of 73.57±5.56% was found, with greater sensitivity (88.33±4.84%) for the correct identification of rough voices. Regarding the NVQV vs. breathy discrimination, an accuracy of 82.38±3.73% was found, with greater sensitivity (87.50±5.16%) in the correct identification of breathy voices. In the breathy vs. tense discrimination, an accuracy of 71.43±4.76% was found, with greater specificity (81.67±4.08%) in the correct identification of breathy voices.

Once again, in isolation, only the GNE measure showed acceptable values in the discrimination of the different voice qualities. In this context, GNE proved especially important in differentiating breathy voices from other voice types. This finding is probably because GNE is directly related to the source of the voice signal, i.e., whether it comes from vocal fold vibration or turbulent airflow generated in the vocal tract(4,5). This could explain the direct relationship between breathiness and this parameter.

The hypothesis that a combination of traditional acoustic and formantic measures can improve the discrimination of predominant voice quality was confirmed, as the combination of these measures improved the performance of the classifier when discriminating between NVQV and rough, NVQV and breathy and breathy and tense voices. It also provided acceptable discrimination between rough and tense voices.

When discriminating between NVQV and rough voices, the best combination found was the measures of the means of F0, shimmer and GNE. This combination had an accuracy of 78.57±4.21% and greater sensitivity (87.50±5.16%) in the correct identification of rough voices. The mean F0 and shimmer values were higher in patients with a rough voice quality, while the GNE values were reduced in relation to NVQV voices.

In general, it is expected that rough voices will have lower F0 values(18). However, the increase of this measure in this study may be explained by the fact that patients with rough voices possibly had tension associated with emission and that, therefore, there was an increase in F0(2,14,28) compared to patients with NVQV.

Shimmer is a measure related to the variation in amplitude between adjacent cycles and is thus related to vibratory irregularity and glottic resistance(4,29). On the auditory-perceptual plane, previous studies have shown that shimmer is related to roughness(17,18). The shimmer values in this study contributed to the correct identification of rough voices. It should be noted that although the shimmer values were most deviated in voices with roughness, these values were still within the normal range, given the cutoff values adopted.

The objectives of one study(18) included an analysis of the discriminatory power of acoustic measures when classifying deviation intensity and differentiating predominant voice types. A total of 186 dysphonic patients participated in the study. The measures used were the fundamental frequency (F0), jitter, shimmer and GNE. The results showed that the shimmer and GNE were useful in detecting rough and breathy voices, respectively.

Data from the aforementioned study(18) appear similar to those found in the present one, as shimmer contributed to the identification of roughness in this study. GNE, although appearing in all combinations, seemed to be more sensitive in relation to voices with a breathy quality.

The F3 and GNE measures were selected as the best combination when discriminating between NVQV and breathy voices (84.05±3.29%) and had high sensitivity (90.00±5.09%) in the correct identification of breathy voices. Patients with breathy voices had higher F3 values and lower GNE values.

The F3 frequency is related to the two cavities established by the tongue position, that is, the cavity behind the tongue constriction and the one in front of it. The F3 frequency can also be affected by adjustments to the lips, larynx and pharynx, and it has a tendency to decrease with labiodentalization adjustment and lip rounding and to increase with constriction around the pharynx(3,10,11,20). Thus, one can infer that patients with a predominantly breathy voice quality have a greater constriction around the pharynx and more stretched lips, probably as compensatory mechanisms to increase voice intensity.

The findings of this study reinforce the fact that the GNE measure is strongly related to the breathy voice quality(4,5,18,28) and is the only isolated measure with acceptable accuracy when discriminating between NVQV and breathy signals.

When discriminating between rough and tense voices, the best combination found was the mean F0, F3 and GNE (73.75±3.75%), and this combination had greater specificity (84.17±5.75%) when identifying rough voices. The mean F0 was lower in patients with roughness than in those with tense voices, F3 had higher values in patients with roughness, and the GNE was higher in patients with a tense voice quality.
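The rates quoted above can be made concrete with a minimal sketch of how accuracy, sensitivity and specificity are derived from a set of classified voices. The labels below are hypothetical and are not study data; treating "tense" as the positive class, and reading the "±" figures as a mean and standard deviation across cross-validation folds, are assumptions made only for this illustration.

```python
from statistics import mean, stdev


def classification_rates(true_labels, predicted_labels, positive):
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # positive-class voices correctly identified
    specificity = tn / (tn + fp)  # negative-class voices correctly identified
    return accuracy, sensitivity, specificity


# Hypothetical perceptual labels and classifier outputs (not study data).
true_labels = ["rough"] * 5 + ["tense"] * 5
predicted_labels = ["rough"] * 4 + ["tense"] * 4 + ["rough"] * 2

# Assumption: "tense" is the positive class, so specificity refers to rough voices.
accuracy, sensitivity, specificity = classification_rates(
    true_labels, predicted_labels, positive="tense"
)

# Assumption: a "mean±SD" figure summarizes the rate across validation folds.
fold_accuracies = [0.70, 0.75, 0.80]  # hypothetical per-fold accuracies
summary = f"{100 * mean(fold_accuracies):.2f}±{100 * stdev(fold_accuracies):.2f}%"
```

With the positive class defined this way, a high specificity means that few rough voices were mistaken for tense ones.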

The findings suggest that patients with a tense voice quality may have greater longitudinal tension in the vocal folds, as indicated by their higher mean F0 values. Furthermore, it appears that patients with roughness have a smaller cavity in the vocal tract, as indicated by the increase in F3 (11,13), and that patients with a tense voice quality seem to have less noise in the voice (4,5) than patients with roughness, as suggested by the fact that the GNE is less deviated in tense voices.

Acceptable discrimination between rough and tense voices was obtained only with a combination of measures; no isolated measure reached an acceptable value. This demonstrates the importance of finding the best combination of formantic measures to identify voice quality (4,24).

The measures relating to the mean F0, F1 and GNE were selected for discrimination between breathy and tense voices, with an accuracy of 75.71±6.41% and with higher specificity (78.33±8.16%) in the correct identification of breathy voices. The F0 and F1 values were greater in patients with tense voices, and the GNE was lower in patients with breathy voices.

Regarding the mean F0 and tense voice quality, it is important to note that the fundamental frequency is determined, among other factors, by vocal fold tension, which is controlled by the intrinsic laryngeal muscles, specifically the cricothyroid (2,11,15). Thus, patients with vocal tension usually exhibit greater contraction of the extrinsic and intrinsic muscles, including greater longitudinal vocal fold tension, greater subglottic pressure and greater vocal tract constriction, generating a larger number of glottic cycles per second and hence a higher fundamental frequency (25).
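As a rough illustration of why greater longitudinal tension raises F0, the vocal folds can be approximated as an ideal vibrating string. This is a textbook simplification assumed here for illustration, not a model used in this study; the symbols $L$ (vibrating length), $\sigma$ (longitudinal stress) and $\rho$ (tissue density) are introduced only for this sketch.

```latex
F_0 = \frac{1}{2L}\sqrt{\frac{\sigma}{\rho}}
```

Under this approximation, F0 grows with the square root of the longitudinal stress, so, for a fixed vibrating length, a fourfold increase in stress would double F0.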

The general grade and roughness seem to be parameters more related to F0 (28,30). The mean F0 values are higher both with a higher general grade and with vocal tension, and the F0 standard deviation values are also high in rough voices. The findings of this study seem to agree with these reports in regard to the increase in F0 in patients with tense voices and the positive relationship between F0 and the general grade of voice deviation.

In relation to the increased F1 values, it would seem that patients with a tense voice quality may make adjustments in the vocal tract, having a larger vertical opening of the mouth and greater pharyngeal constriction (6-8,10,11) than patients with a breathy voice quality.

A study (14) conducted with 111 women with muscle tension dysphonia found similar results. The F1 and F2 formants were elevated in this population compared to those with healthy voices, suggesting adjustments in the supraglottis relating to a greater vertical opening of the mouth, greater pharyngeal constriction and a lower and more anterior tongue position. The adjustments found in that study are similar to those of the present study in regard to the greater vertical opening of the mouth and increased pharyngeal constriction as indicated by an increase in F1 in patients with a tense voice quality.

Analysis of the combined acoustic measures in the discrimination of the predominant voice quality again revealed that the GNE measure appeared in all acceptable combinations found. The F0 measure was present in most of the combinations when discriminating predominant voice quality, which corroborates previous studies (8,13,18,29,30) in which the fundamental frequency proved to be a useful measure when discriminating voice quality. This is probably because it is related, in physiological terms, to the neuromuscular condition and the regularity of vocal fold mucosal vibration and, in acoustic and perceptual terms, to the periodicity of the sound signal (6,9,11,30).

In summary, a combination of perturbation/noise measures and formantic measures yields a slight improvement in the classification rate between voices with NVQV and those with mild to moderate deviation (75.24%) compared with the GNE measure alone (70.95%). This combination also allows discrimination between voices with mild to moderate and moderate deviations, which was not achieved with isolated measures. These findings offer evidence that the greater the voice deviation intensity, the more complex the signal in terms of aperiodicity and noise. Such intensities therefore require a combination of measures to characterize them adequately.
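The idea that two measures together can separate groups that neither separates alone can be illustrated with a toy example. The sketch below uses a leave-one-out nearest-centroid rule on invented standardized values; both the data and the classifier are assumptions chosen for simplicity and are not the data or the classification method of this study.

```python
from statistics import mean

# Hypothetical standardized (z-score) values for two acoustic measures.
# Each tuple is (measure_1, measure_2); the numbers are invented for illustration.
GROUP_A = [(1.0, 0.2), (0.2, 1.0), (1.2, -0.4), (-0.4, 1.2)]
GROUP_B = [(-1.0, -0.2), (-0.2, -1.0), (-1.2, 0.4), (0.4, -1.2)]


def centroid(points):
    """Mean of each coordinate across a list of equal-length tuples."""
    return tuple(mean(coord) for coord in zip(*points))


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def loo_accuracy(group_a, group_b, columns):
    """Leave-one-out accuracy of a nearest-centroid rule using only `columns`."""
    samples = [(tuple(p[c] for c in columns), "A") for p in group_a]
    samples += [(tuple(p[c] for c in columns), "B") for p in group_b]
    correct = 0
    for i, (point, label) in enumerate(samples):
        # Hold out one sample and estimate both centroids from the rest.
        training = samples[:i] + samples[i + 1:]
        centroids = {
            cls: centroid([p for p, lab in training if lab == cls])
            for cls in ("A", "B")
        }
        predicted = min(
            centroids, key=lambda cls: squared_distance(point, centroids[cls])
        )
        correct += predicted == label
    return correct / len(samples)


acc_measure_1 = loo_accuracy(GROUP_A, GROUP_B, (0,))
acc_measure_2 = loo_accuracy(GROUP_A, GROUP_B, (1,))
acc_combined = loo_accuracy(GROUP_A, GROUP_B, (0, 1))
```

In this toy data set each measure overlaps between the groups when taken alone, whereas the pair of measures separates them, which mirrors the pattern described above without reproducing its numbers.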

Furthermore, a combined analysis of measures relating to the glottal source (perturbation and noise) and filter (formantic measures) contributes to a broadening of our understanding of source-filter interaction mechanisms in deviated voices and may be useful when measuring the results of treatment and monitoring during voice therapy. The fact that more formantic measures (F1 and F3) were selected by the classifier for discriminating more deviated voices shows that individuals with more intense deviations make more vocal tract adjustments, probably as a compensatory mechanism in response to the functional inefficiency of the glottal source.

In regard to the predominant voice quality, the formantic measures proved important when discriminating between NVQV and breathy (F3), rough and tense (F3) and breathy and tense (F1) voices. Specifically, the formantic measures seem to provide a greater contribution to the discrimination of the auditory-perceptual parameter tension. Individuals with tense voices probably make more supraglottic adjustments, either for compensatory reasons or in co-occurrence with the alterations at the glottic level.

The presence of a voice disorder tends to change the voice signal in different ways and may combine various types of perturbation and noise in vocal emissions as well as possible supraglottic adjustments. The combined use of measures for the evaluation, characterization and classification of the voice signal may therefore better represent voice production characteristics and highlight manifestations that would not be detected with the use of isolated measures. Other studies (3,28,30) have shown that a combination of perturbation and noise measures improves the discrimination between signals with and without voice deviations. From the present study, however, it may be concluded that combining measures related to vocal tract adjustments with traditional perturbation and noise measures can improve the classification of the voice deviation intensity and type and provide insights into the source-filter interaction in patients with voice deviations.

CONCLUSION

The GNE acoustic measure was the only one able to discriminate voice deviation intensity and predominant voice quality in isolation. There was a gain in the classification performance when traditional acoustic and formantic measures were combined in the discrimination of both the voice deviation intensity and predominant voice quality.

  • Study conducted at Departamento de Fonoaudiologia, Universidade Federal da Paraíba – UFPB - João Pessoa (PB), Brasil.
  • Financial support: National Council for Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq). Process nº 480168/2013-0.

REFERENCES

  1. Dejonckere PH, Bradley P, Clemente P, Cornut G, Crevier-Buchman L, Friedrich G, et al. A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating assessment techniques. Eur Arch Otorhinolaryngol. 2001;258(2):77-82. http://dx.doi.org/10.1007/s004050000299. PMid:11307610.
  2. Kempster GB, Gerratt BR, Verdolini Abbott K, Barkmeier-Kraemer J, Hillman RE. Consensus auditory-perceptual evaluation of voice: development of a standardized clinical protocol. Am J Speech Lang Pathol. 2009;18(2):124-32. http://dx.doi.org/10.1044/1058-0360(2008/08-0017). PMid:18930908.
  3. Brockmann-Bauser M, Drinnan MJ. Routine acoustic voice analysis: time to think again? Curr Opin Otolaryngol Head Neck Surg. 2011;19(3):165-70. http://dx.doi.org/10.1097/MOO.0b013e32834575fe. PMid:21483265.
  4. Godino-Llorente JI, Osma-Ruiz V, Sáenz-Lechón N, Gómez-Vilda P, Blanco-Velasco M, Cruz-Roldán F. Effectiveness of the glottal to noise excitation ratio for the screening of voice disorders. J Voice. 2010;24(1):47-56. http://dx.doi.org/10.1016/j.jvoice.2008.04.006. PMid:19135854.
  5. Treole K, Trudeau MD. Changes in sustained production tasks among women with bilateral vocal nodules before and after voice therapy. J Voice. 1997;11(4):462-9. http://dx.doi.org/10.1016/S0892-1997(97)80043-4. PMid:9422281.
  6. Fant G. Acoustic theory of speech production. 2nd ed. Paris: Mouton; 1970.
  7. Ladefoged P. Elements of acoustic phonetics. Chicago: University of Chicago Press; 1996.
  8. Camargo ZA. Análise da qualidade vocal de um grupo de indivíduos disfônicos: uma abordagem interpretativa e integrada de dados de natureza acústica, perceptiva e eletroglotográfica [tese]. São Paulo: Pontifícia Universidade Católica de São Paulo; 2002. 283 p.
  9. Silva MFBL, Madureira S, Rusilo LC, Camargo Z. Vocal quality assessment: methodological approach for a perceptive data analysis. Rev CEFAC. 2017;19(6):831-41. http://dx.doi.org/10.1590/1982-021620171961417.
  10. Magri A, Stamado T, Camargo ZA. Influência da largura de banda de formantes na qualidade vocal. Rev CEFAC. 2009;11(2):296-304. http://dx.doi.org/10.1590/S1516-18462009005000010.
  11. Lee S-H, Yu J-F, Hsieh Y-H, Lee G-S. Relationships between formant frequencies of sustained vowels and tongue contours measured by ultrasonography. Am J Speech Lang Pathol. 2015;24(4):739-49. http://dx.doi.org/10.1044/2015_AJSLP-14-0063. PMid:26254465.
  12. Titze I, Palaparthi A. Sensitivity of source-filter interaction to specific vocal tract shapes. IEEE Trans Audio Speech Lang Process. 2016;24(12):2507-15. http://dx.doi.org/10.1109/TASLP.2016.2616543.
  13. Camargo ZA, Vilarim GS, Cukier S. Parâmetros perceptivo-auditivos e acústicos de longo termo da qualidade vocal de indivíduos disfônicos. Rev CEFAC. 2004;6(2):189-96.
  14. Roy N, Nissen SL, Dromey C, Sapir S. Articulatory changes in muscle tension dysphonia: evidence of vowel space expansion following manual circumlaryngeal therapy. J Commun Disord. 2009;42(2):124-35. http://dx.doi.org/10.1016/j.jcomdis.2008.10.001. PMid:19054525.
  15. Muhammad G, Mesallam TA, Malki KH, Farahat M, Alsulaiman M, Bukhari M. Formant analysis in dysphonic patients and automatic Arabic digit speech recognition. Biomed Eng Online. 2011;10:41. PMid:21624137.
  16. Schwartz SR, Cohen SM, Dailey SH, Rosenfeld RM, Deutsch ES, Gillespie MB, et al. Clinical practice guideline: hoarseness (dysphonia). Otolaryngol Head Neck Surg. 2009;141(3, Suppl 2):S1-31. http://dx.doi.org/10.1016/j.otohns.2009.06.744. PMid:19729111.
  17. Ma EP, Yiu EM. Multiparametric evaluation of dysphonic severity. J Voice. 2006;20(3):380-90. http://dx.doi.org/10.1016/j.jvoice.2005.04.007. PMid:16185841.
  18. Lopes LW, Cavalcante DP, Costa PO. Severity of voice disorders: integration of perceptual and acoustic data in dysphonic patients. CoDAS. 2014;26(5):382-8. http://dx.doi.org/10.1590/2317-1782/20142013033. PMid:25388071.
  19. Cohen SM, Pitman MJ, Noordzij JP, Courey M. Management of dysphonic patients by otolaryngologists. Otolaryngol Head Neck Surg. 2012;147(2):289-94. http://dx.doi.org/10.1177/0194599812440780. PMid:22368039.
  20. Gonçalves MIR, Pontes PAL, Vieira VP, Pontes AAL, Curcio D, Biase NG. Função de transferência das vogais orais do Português brasileiro: análise acústica comparativa. Rev Bras Otorrinolaringol. 2009;75(5):680-4.
  21. Ozkan H. A comparison of classification methods for telediagnostics of Parkinson’s disease. Entropy. 2016;18(115):1-14.
  22. Yamasaki R, Madazio G, Leão SHS, Padovani M, Azevedo R, Behlau M. Auditory-perceptual evaluation of normal and dysphonic voices using the voice deviation scale. J Voice. 2017;31(1):67-71. http://dx.doi.org/10.1016/j.jvoice.2016.01.004. PMid:26873420.
  23. Hosmer DW, Lemeshow S. Applied logistic regression. New York: Wiley; 2000. http://dx.doi.org/10.1002/0471722146.
  24. González CMT, Hernandez JBA, Orozco-Arroyave JR, Casals JS, Gallego-Jutgla E. Automatic detection of laryngeal pathologies in running speech based on the HMM transformation of the nonlinear dynamics. Lect Notes Comput Sci. 2013;1:136-43.
  25. Van Houtte E, Van Lierde K, Claeys S. Pathophysiology and treatment of muscle tension dysphonia: a review of the current knowledge. J Voice. 2011;25(2):202-7. http://dx.doi.org/10.1016/j.jvoice.2009.10.009. PMid:20400263.
  26. Macari AT, Ziade G, Turfe Z, Chidiac A, Alam E, Hamdan AL. Correlation between the position of the hyoid bone on lateral cephalographs and formant frequencies. J Voice. 2016;30(6):757.e21-6. http://dx.doi.org/10.1016/j.jvoice.2015.08.020. PMid:26604010.
  27. Samlan RA, Story BH, Bunton K. Relation of perceived breathiness to laryngeal kinematics and acoustic measures based on computational modeling. J Speech Lang Hear Res. 2013;56(4):1209-23. http://dx.doi.org/10.1044/1092-4388(2012/12-0194). PMid:23785184.
  28. Lopes LW, Costa SLNC, Costa WCA, Correia SEN, Vieira VJD. Acoustic assessment of the voices of children using nonlinear analysis: proposal for assessment and vocal monitoring. J Voice. 2014;28(5):565-73. http://dx.doi.org/10.1016/j.jvoice.2014.02.013. PMid:24836362.
  29. Madazio G, Leão S, Behlau M. The phonatory deviation diagram: a novel objective measurement of vocal function. Folia Phoniatr Logop. 2011;63(6):305-11. http://dx.doi.org/10.1159/000327027. PMid:21625144.
  30. Lopes LW, Simões LB, Silva JD, Evangelista DS, Ugulino ACN, Costa Silva PL, et al. Accuracy of acoustic analysis measurements in the evaluation of patients with different laryngeal diagnoses. J Voice. 2017;31(3):382.e15-26. http://dx.doi.org/10.1016/j.jvoice.2016.08.015. PMid:27742492.

Publication Dates

  • Publication in this collection
    22 Oct 2018
  • Date of issue
    2018

History

  • Received
    08 Jan 2018
  • Accepted
    09 Apr 2018