Electroglottography of speakers of Brazilian Portuguese through Objective Multiparameter Vocal Assessment (EVA)

Abstract EVA was designed to study various speech production parameters. Objective: This paper aims to define the mean values for electroglottography tests of Brazilian Portuguese speakers on EVA. Materials and Method: The voices of 20 men and 20 women without voice-related complaints were analyzed through electroglottography so as to obtain reference values for normality. Study design: this is a descriptive cross-sectional study. Results: The mean values for normal male voices were: F0 = 127.77 Hz; F0 coefficient of variation = 2.51%; absolute jitter = 1.707 Hz; relative average perturbation = 0.0083; jitter factor = 1.34%; jitter ratio = 13.45‰; CQ = 0.447. The values for female voices were: F0 = 204.87 Hz; F0 coefficient of variation = 1.58%; absolute jitter = 3.30 Hz; relative average perturbation = 0.0102; jitter factor = 1.60%; jitter ratio = 16.23‰; CQ = 0.443. Wave type for the entire sample was categorized as peak skewing. Conclusion: Statistically significant differences between genders were found for mean F0 and absolute jitter. When using acoustic analysis software, users must rely on the reference parameters inherent to that software program when analyzing the collected data.


INTRODUCTION
Voice assessment in speech therapy can be performed through auditory perceptual analysis, considered the gold standard, or through acoustic analysis, a set of measurements performed from computer-generated tracings 1 .
Acoustic analysis added objectivity to speech assessment. Additionally, it allowed increased diagnostic accuracy, the identification and documentation of short and long term therapy results, and the possibility of providing patients with visual feedback 2 .
Electroglottography (EGG) is a non-invasive test that estimates the contact area variation between vocal folds as voice is produced 3 . It has been used in acoustic analysis since the 1940s in clinical and research settings 4 .
A center in France dedicated to studying speech and language developed a multiparameter method for objective assisted voice assessment (EVA) that uses the SESANE data processor. EVA was designed to study speech production parameters such as sound, intensity, and aerodynamic measurements, to name a few. It is equipped with a series of sensors to measure these parameters, and thus offers improved diagnostic capabilities and enhanced patient follow-up in terms of surgery, drug therapy, and speech therapy outcomes 5 .
Acoustic analysis software programs for speech and voice differ in the way they calculate acoustic parameters, and the outcome of the measurements may be affected by linguistic variations stemming from the cultural patterns of a language 6 . Results also vary depending on the recording instrumentation, ambient noise, and the gender or age of the speaker, which shows that the quality of the equipment used to record the patient's voice, the type of software, and the anatomical-functional characteristics of the larynx may affect measurements in the short run 1 . Normative values can only be assessed by means of standardized criteria agreed upon by consensus 7 . Standardization educates, simplifies, saves time, money and effort, aside from ensuring certification 8 .
There are no studies in the literature describing the use of EVA-based electroglottography in Brazilian Portuguese speakers.
The purpose of this study is to analyze mean values for fundamental frequency (F0), F0 coefficient of variation, absolute jitter, relative average perturbation (RAP), jitter ratio, jitter factor, mean closed quotient (CQ), and the interpretation of the electroglottography wave types of the EGG/EVA software, so as to gather preliminary data on normal patterns of speakers of Brazilian Portuguese of both genders.

MATERIALS AND METHODS
This is a descriptive cross-sectional study. Forty native speakers of Brazilian Portuguese (20 males and 20 females) aged between 18 and 45 years were enrolled. The age range was selected to exclude individuals undergoing voice change and those with presbyphonia. The mean age of female subjects was 28 years; male subjects had a mean age of 30 years.
None of the subjects had voice-related complaints. Auditory perceptual analysis performed by two speech and hearing therapists did not show altered voice quality or any other communication disorder that could prevent them from performing the tests.
Enrolled subjects were informed of the purpose, procedures, and publication of the test results in this study, and signed an informed consent form. This study was approved by the Research Ethics Committee of our institution and was granted permit ETIC 0488.0.203.000-10.
All subjects had their acoustic and electroglottographic signals recorded while saying the phrase "Mara lava a batata" twice in a row. The second utterance of the phrase was used for data analysis purposes due to its increased acoustic stability. The vowel /a/ in syllable /la/ was analyzed, as it is assumed that there is lesser influence from the vocal tract, given that the tonic syllable is located in the more central portion of the phrase.
Acoustic analysis software EVA was used to record and analyze speech samples. Recordings were done using a Dell Vostro 200 workstation and a professional -44 dBV AKG Acoustics C1000S condenser stereo multidirectional microphone. Two electrodes were placed on the wings of the thyroid cartilage and the informant was kept at a fixed 10 cm from the microphone to allow for proper capturing of the electroglottography signal.
Electroglottography measurements were made so as to obtain reference values for mean fundamental frequency (F0), F0 coefficient of variation, absolute jitter, relative average perturbation (RAP), jitter ratio, jitter factor, and closed quotient (CQ) (Table 1). The software program's manual contains a detailed description of all analyzed parameters defined below 9 .
Mean F0 offers a general measurement of vocal frequency and corresponds to the number of vocal fold vibration cycles per second. The unit of measurement is Hertz (Hz).
F0 coefficient of variation is the standard deviation of F0 relative to the mean F0. This measurement expresses the magnitude of F0 changes as a percentage of the mean F0 value. For example, a standard deviation of 4.9 Hz for a mean F0 of 180 Hz results in a coefficient of variation of 2.7%. The same standard deviation for a mean F0 of 500 Hz yields a much smaller coefficient of variation of 0.98%. The F0 coefficient of variation is the best indicator to explore the stability of the mean fundamental frequency over time, and is highly relevant in the detection of alterations such as tremor and other instabilities of neurological origin. It is measured as a percentage (%).
Short term instability (absolute jitter) of F0 reflects the changes in frequency from one oscillation cycle to the next. It is calculated as the mean absolute F0 difference between two consecutive vibration cycles. These alterations can be accurately calculated for each cycle. It is measured in Hertz (Hz).
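As an illustration, the F0 coefficient of variation and absolute jitter can be computed as in the following Python sketch. It assumes that a list of per-cycle F0 values is available and uses the population standard deviation; the exact computation performed by the EVA software is not reproduced here. The function names and the per-cycle sample values are hypothetical.

```python
from statistics import mean, pstdev


def cv_from_summary(sd_hz, mean_f0_hz):
    """F0 coefficient of variation (%) from a standard deviation and a mean F0."""
    return 100.0 * sd_hz / mean_f0_hz


def f0_coefficient_of_variation(f0_values):
    """F0 coefficient of variation (%) from per-cycle F0 values.

    Assumption: the population standard deviation is used.
    """
    return cv_from_summary(pstdev(f0_values), mean(f0_values))


def absolute_jitter(f0_values):
    """Mean absolute F0 difference (Hz) between consecutive cycles."""
    diffs = [abs(b - a) for a, b in zip(f0_values, f0_values[1:])]
    return mean(diffs)


# Worked example from the text: the same 4.9 Hz standard deviation
# weighs less against a higher mean F0.
print(round(cv_from_summary(4.9, 180.0), 2))  # 2.72
print(round(cv_from_summary(4.9, 500.0), 2))  # 0.98

# Hypothetical per-cycle F0 values (Hz), for illustration only.
cycles_hz = [180.0, 181.0, 179.5, 180.5]
print(absolute_jitter(cycles_hz))
```

The first two printed values reproduce the 2.7% and 0.98% figures of the worked example.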
Relative average perturbation (RAP) measures the mean variation across three consecutive periods, normalized by the mean period of the observed signal. This variable has no unit of measurement.
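A minimal sketch of this measure is given below. It assumes the commonly used formulation in which each period is compared with the average of itself and its two neighbors, and the mean of these deviations is divided by the mean period; whether the EVA software applies exactly this formulation is an assumption. The function name and sample periods are hypothetical.

```python
from statistics import mean


def relative_average_perturbation(periods):
    """RAP: mean deviation of each period from its three-point local average,
    divided by the mean period (dimensionless).

    Assumption: three-point smoothing formulation; needs at least 3 periods.
    """
    if len(periods) < 3:
        raise ValueError("at least three consecutive periods are required")
    deviations = [
        abs((periods[i - 1] + periods[i] + periods[i + 1]) / 3.0 - periods[i])
        for i in range(1, len(periods) - 1)
    ]
    return mean(deviations) / mean(periods)


# Hypothetical glottal periods in milliseconds, for illustration only.
periods_ms = [5.0, 5.1, 5.0, 5.1, 5.0]
print(relative_average_perturbation(periods_ms))
```

Because numerator and denominator share the same unit, the result is a pure ratio, in line with the statement that RAP has no unit of measurement.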
Jitter factor establishes a ratio between mean absolute jitter and mean F0. A mean jitter of 0.677 Hz and a mean F0 of 180 Hz correspond to a jitter factor of 0.38%. Jitter factor is a good indicator to explore the short term stability of the fundamental frequency. It is measured as a percentage (%).
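The worked example above can be checked with a short Python sketch. The function name is hypothetical and not part of the EVA software.

```python
def jitter_factor(mean_abs_jitter_hz, mean_f0_hz):
    """Jitter factor (%): mean absolute jitter relative to mean F0."""
    return 100.0 * mean_abs_jitter_hz / mean_f0_hz


# Example from the text: 0.677 Hz of mean jitter at a mean F0 of 180 Hz.
print(round(jitter_factor(0.677, 180.0), 2))  # 0.38
```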
Jitter ratio measures the mean variation in period between two consecutive vibration cycles. A high jitter ratio always signifies a relevant F0 coefficient of variation, although the opposite is not true. Indeed, small upward or downward variations in F0 between cycles do not produce a relevant jitter ratio, but may lead to significant global F0 variations, such as vibrato. The unit of measurement is permillage (‰).
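The following Python sketch illustrates why a slow, vibrato-like modulation can yield a sizeable F0 coefficient of variation while the jitter ratio remains small. It assumes the usual definition of jitter ratio as the mean absolute difference between consecutive periods divided by the mean period, multiplied by 1000; the modulation depth and rate are hypothetical values chosen for illustration.

```python
import math
from statistics import mean, pstdev


def jitter_ratio(periods):
    """Jitter ratio (per mille): mean absolute difference between
    consecutive periods, relative to the mean period, times 1000."""
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return 1000.0 * mean(diffs) / mean(periods)


# Hypothetical slow modulation: 200 Hz +/- 10 Hz, one modulation
# cycle every 100 glottal cycles, observed over 200 glottal cycles.
f0_hz = [200.0 + 10.0 * math.sin(2.0 * math.pi * i / 100.0) for i in range(200)]
periods_s = [1.0 / f for f in f0_hz]

cv_percent = 100.0 * pstdev(f0_hz) / mean(f0_hz)
jr_permille = jitter_ratio(periods_s)
print(f"F0 coefficient of variation: {cv_percent:.2f} %")
print(f"Jitter ratio: {jr_permille:.2f} per mille")
```

In this sketch the cycle-to-cycle differences are small, so the jitter ratio stays low even though F0 drifts widely over the whole sample.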
The closed quotient measures the ratio between the time closed (Tc) and the complete glottal cycle (Tc + To): CQ = Tc/(Tc + To). It is expressed as a percentage (%) (Figure 1).
The EVA software manual states that CQ normal values, based on French speakers, range between 0.4 and 0.6. Values between 0 and 0.4 suggest glottal hypoadduction and values greater than 0.6 and smaller than 1.0 suggest glottal hyperadduction.
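The closed quotient formula and the reference ranges quoted from the manual can be sketched in Python as follows. The function names, the handling of the boundary values 0.4 and 0.6 (treated here as normal), and the sample durations are assumptions made for illustration.

```python
def closed_quotient(t_closed, t_open):
    """CQ = Tc / (Tc + To), with both durations in the same unit."""
    return t_closed / (t_closed + t_open)


def classify_cq(cq):
    """Classify a CQ value against the 0.4 to 0.6 range quoted from the manual.

    Assumption: the boundaries 0.4 and 0.6 are counted as normal.
    """
    if not 0.0 <= cq < 1.0:
        raise ValueError("CQ is expected to lie between 0 and 1")
    if cq < 0.4:
        return "suggests glottal hypoadduction"
    if cq <= 0.6:
        return "within normal range"
    return "suggests glottal hyperadduction"


# Hypothetical closed and open phase durations in milliseconds.
cq = closed_quotient(3.5, 4.3)
print(round(cq, 3), classify_cq(cq))
```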
In addition to closed quotient, electroglottographic waves were qualitatively analyzed, categorized, interpreted according to waveform characteristics, and related to templates of glottal geometric variation 10 :
1. Uniform shift of the vocal folds towards the midline;
2. Peak skewing: occurs when there is increased glottal convergence, i.e., when a vocal fold is more acutely angled and wedged;
3. Bulging pulse: occurs when two knees are seen in the tracing, one going up and another going down;
4. Sloping pulse: occurs when there is a slight difference in the phase angles between the upper and lower margins of the vocal fold free borders, changing the waveform to a more quadrangular or triangular shape when the angle difference between upper and lower margins is greater 11 (Figure 2).
Statistical analysis was carried out using the SPSS (Statistical Package for the Social Sciences) statistical package, release 17.0. Initially, a descriptive analysis of the data was performed looking at measures of central tendency and dispersion. The data followed a normal distribution. Therefore, the comparison of values between genders was done using Student's t-test with a confidence level of 95%. Table 2 shows minimum and maximum values, standard deviation and level of significance of electroglottographic measurements in female and male individuals.
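The between-gender comparison can be illustrated with a pooled-variance Student's t-test, sketched below in plain Python. This is not the SPSS procedure itself. The individual values are simulated around the group means reported in this paper, with hypothetical standard deviations of 15 Hz and 20 Hz, and the critical value 2.024 is the tabulated two-sided 5% threshold for 38 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples, equal variances assumed.

    Returns the statistic and its degrees of freedom.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df


# Simulated mean F0 values (Hz) for 20 men and 20 women. The standard
# deviations are hypothetical; only the group means come from the paper.
rng = random.Random(0)
male_f0 = [rng.gauss(127.77, 15.0) for _ in range(20)]
female_f0 = [rng.gauss(204.87, 20.0) for _ in range(20)]

t_stat, df = pooled_t_statistic(male_f0, female_f0)
T_CRITICAL_DF38 = 2.024  # tabulated two-sided value, alpha = 0.05, 38 df
print(f"t = {t_stat:.2f}, df = {df}, significant = {abs(t_stat) > T_CRITICAL_DF38}")
```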

RESULTS
There is a statistically significant difference between genders for mean F0 and absolute jitter measurements. In the analysis of electroglottographic wave type according to Titze 10 , for both studied groups 100% of the subjects had peak skewing wave types.

DISCUSSION
Electroglottography looks into the contact pattern of vocal folds during the glottal cycle to assess vocal function 12 . A high frequency, low amplitude current is applied to the subject's neck structures and vocal folds through electrodes placed bilaterally on the neck 13 .
Human tissues conduct electricity reasonably well when compared to air 14 . The opening and closing of vocal folds cause impedance levels to vary in the larynx, thus altering the flow of electricity between the electrodes 12 . Current levels are affected by resistance levels, and consequently by tissue impedance 15 .
When vocal folds touch there is some flow of electricity, and as they move away from each other flow is significantly reduced. Only a small portion of the flow of electricity recorded shows the contact between vocal folds 16 .
The resulting electroglottogram (EGG) shows the variation of vocal fold impedance as a function of time. Impedance also varies considerably with skin type and vertical laryngeal motion. High-pass filters are used to eliminate this low frequency interference and isolate the variation caused by vocal fold vibration 17 .
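The effect of high-pass filtering on an EGG-like signal can be sketched as follows. The example uses a simple first-order high-pass filter, which attenuates slow drift (such as that caused by vertical laryngeal motion) while largely preserving the faster component at the vibration frequency. The filter type, the 20 Hz cutoff, the sampling rate, and the synthetic signals are all assumptions for illustration and do not describe the EVA hardware.

```python
import math


def high_pass(samples, sample_rate_hz, cutoff_hz):
    """First-order (RC-style) high-pass filter applied sample by sample."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    filtered = [samples[0]]
    for i in range(1, len(samples)):
        filtered.append(alpha * (filtered[-1] + samples[i] - samples[i - 1]))
    return filtered


def settled_peak(signal):
    """Largest absolute value in the second half of the signal,
    i.e. after the initial filter transient has died out."""
    return max(abs(v) for v in signal[len(signal) // 2:])


SAMPLE_RATE_HZ = 10_000  # hypothetical sampling rate
CUTOFF_HZ = 20.0         # hypothetical cutoff frequency
times = [i / SAMPLE_RATE_HZ for i in range(SAMPLE_RATE_HZ)]  # one second

# Synthetic components: slow 2 Hz drift and a 130 Hz vibration-like tone.
drift = [5.0 * math.sin(2.0 * math.pi * 2.0 * t) for t in times]
vibration = [math.sin(2.0 * math.pi * 130.0 * t) for t in times]

drift_out = high_pass(drift, SAMPLE_RATE_HZ, CUTOFF_HZ)
vibration_out = high_pass(vibration, SAMPLE_RATE_HZ, CUTOFF_HZ)
print(f"drift peak: {settled_peak(drift):.2f} -> {settled_peak(drift_out):.2f}")
print(f"vibration peak: {settled_peak(vibration):.2f} -> {settled_peak(vibration_out):.2f}")
```

Since the filter is linear, filtering the two components separately gives the same result as filtering their sum, which makes the attenuation of each component easy to inspect.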
Various objective measurements may be gathered from the analysis of electroglottograms, including parameters such as vibration fundamental frequency, amplitude perturbation (shimmer), frequency perturbation (jitter), and closed quotient 18 . Altuzarra & San Martin 19 reported that EGG is a broadly accepted method to measure fundamental frequency and F0 perturbation.
Electroglottogram tracings can be interpreted in many different ways. One may consider the configuration of the tracing curves, their amplitude, cycle periodicity, and the presence or absence of knees 10 . The electroglottogram waveform reflects the amount of cross-sectional impedance at the level of the larynx; impedance readings fall as vocal fold contact increases 20 . Vocal function can be assessed by measuring the variations in contact time of the vocal fold mucosa in the posteroanterior and inferosuperior direction of the free border during a vibration cycle 10 .
Electroglottographic studies 11 performed in female patients without functional or anatomical disorders of the vocal tract showed a mean F0 value of 211.69 Hz with a standard deviation of 15.13, and a mean closed quotient of 0.455 with a standard deviation of 0.033. These results agree with those of this study (Table 2) in terms of mean F0 (204.87 Hz) and closed quotient (0.443) values. Nevertheless, the standard deviations found in this study were higher than those presented in the paper mentioned above. These differences may be explained by the fact that the measurements were captured using different electroglottographic systems.
The control group of a study 21 looking into EGG findings in individuals with multiple sclerosis found mean F0 values and jitter factors similar to those reported in this study (Table 2), probably due to the similarities in subject age range and research method.
A study 22 that analyzed EGG findings of laryngeal tumor patients reported a mean F0 of 133.80 Hz and a mean jitter factor of 0.23% among members of the control group. These values differ from those found in our study. That study did not analyze its groups by gender, and data from male and female patients were therefore combined. This may have contributed to the marked differences seen between studies, although the reported findings are similar to the data for the male subjects enrolled in our study.
Fundamental frequency is more easily derived from electroglottograms than from sound wave acoustic analysis as cycles can be seen more clearly, thus confirming the increased reliability in obtaining F0 data from EGG 14,19 . Two studies 23,24 that analyzed Portuguese and Brazilian Portuguese speakers reported mean F0 and standard deviation values practically identical to those found in this study (Table 2).
There was a match in closed quotient values for male (CQ = 0.447) and female (CQ = 0.443) subjects enrolled in this study (Table 2) when compared to other studies 11,24-26 , thus confirming that individuals without laryngeal disorders, mainly nodules 27 , have CQ within normal ranges.
It is worthwhile mentioning that the statistically significant difference observed between male and female groups (Table 2) for mean F0 was also reported in other studies 20 . There are no other papers in the literature reporting on other EGG parameters having gender as a reference.
The electroglottograms of all subjects enrolled in this study showed peak skewing according to the categorization proposed by Titze 10 , as also reported in other studies 11,28 . Peak skewing occurs when there is increased glottal convergence, in situations where the vocal folds do not have free border disorders and show adequate closing 29 .
The parameters considered to assess normality among French speakers are F0 coefficient of variation, absolute jitter, relative average perturbation (RAP), jitter ratio, and jitter factor.
The values mentioned above (Table 1) do not match the findings reported in this study (Table 2). Our results relate to Brazilian Portuguese speakers, and the linguistic variations associated with the language's cultural standards may also affect speech and voice patterns. Those factors combined may lead to significant differences in the acoustic electroglottographic findings of speakers of different languages 6 .
More electroglottographic research using different software programs and looking into other languages is needed to allow for a better understanding of these variables and, consequently, to improve the analyses of these values in subjects with speech and laryngeal disorders. Complete standardization is not possible, as there will always be differences between software programs for speech acoustic analysis. Therefore, when using a software program for acoustic analysis, users must use as reference the parameters inherent to the program they are using to analyze the collected data samples.

CONCLUSION
The electroglottographic parameters that presented statistically significant differences between genders were mean F0 and absolute jitter.
Peak skewing waveform was found in the electroglottograms of 100% of the subjects of both genders.