Ultrasonographic analysis of lateral liquids and coronal fricatives: judgment of experienced and non-experienced judges

ABSTRACT
Purpose: to verify whether the accuracy in the judgment of ultrasound (US) images varies according to the experience of the judges and the sound class (alveolar liquids and coronal fricatives). Methods: ultrasound images of liquids and fricatives produced in the intervocalic context of the vowel /a/ by 20 typical adults were judged by 15 experienced and 15 non-experienced judges. A previous analysis of the images was performed to establish the typical ultrasound pattern of the liquids and fricatives. The accuracy (% of correct and incorrect responses) of the judgments made by the judges was considered in the analysis. A Factorial ANOVA was used, with the sound class and the experience of the judges as factors. Results: the Factorial ANOVA showed a significant effect on the accuracy of the judgment only for the sound class, with no significant effect of the judges' experience or of the interaction between the experience of the judges and the sound class. The liquids were judged less accurately than the fricatives, confirming their articulatory complexity, since they involve the production of two simultaneous gestures.
Conclusion: the accuracy in the judgment of the US images did not vary according to the experience of the judges, but it did depend on the sound class. It should be noted, however, that all judges had previous knowledge of the processes of speech production, which may have favored image interpretation.


INTRODUCTION
Ultrasonographic analysis, normally used to evaluate images of internal organs, has also been highly recommended for obtaining real-time images of tongue movement during speech production [1-3].
In order to obtain the ultrasound image of tongue movement (hereinafter TUS), a transducer is placed in the submandibular region of the speaker. The transducer emits high-frequency waves, which propagate into the oral cavity. Because of the density differences at the tongue/air and air/bone interfaces, the ultrasonic waves are reflected and captured by the transducer. After these waves are detected, the reflection point is calculated, resulting in the creation of an image [4]. Some authors [5-8] argue in favor of using ultrasonography of tongue movement in clinical practice because of its affordable cost compared with other equipment used in articulatory analysis; the portability of the equipment, which does not restrict data collection to the laboratory; the relative comfort for individuals during data collection, since the instrument is non-invasive and requires no previous preparation; and the visualization of the tongue surface, from tip to root, during the production of speech sounds.
However, the ultrasound image of the tongue surface (in the coronal and/or sagittal plane) is not always clear, because the formation of the image depends on the difference in density between the structures of the vocal tract, which varies from individual to individual. Additionally, the passive articulators are not visualized in the ultrasound image of the tongue, part of the tongue-tip information is often lost, and it is difficult to restrict head movement during data collection. These limitations presuppose previous training of clinicians and/or researchers to interpret the images.
Another challenge for the applicability of TUS is the need for clinicians to have previous knowledge of the ultrasonographic pattern of tongue movement in the production of different phonemes. In the literature, some authors have described different speech sounds [9], while others have studied individual sound classes, such as occlusives [10], fricatives [11], liquids [5,12-14] and vowels [15,16].
Despite the difficulties inherent in the use of TUS, the studies neither explicitly recommend previous training of researchers and/or clinicians for this type of analysis [2,7] nor indicate whether any sound class makes the ultrasound image more difficult to interpret.
For Brazilian Portuguese (hereinafter BP), in particular, there is already a qualitative TUS description of the typical adult production of the coronal liquids [5] and the coronal fricatives [17]. These sounds were chosen by the authors because they are acquired late and are commonly involved in substitution processes in atypical speech.
The ultrasound contour of the tongue for /ʃ/ was characterized by a raised tongue with curvature of the dorsum to generate palatal turbulence, while four distinct patterns were described for /s/: 1) absence of sharp curvature of the dorsum or root of the tongue; 2) sharp curvature of the root and dorsum in a descending direction; 3) sharp curvature of the root and dorsum in an upward direction; and 4) sharp curvature and a more anterior dorsum [17].
In contrast, the ultrasound pattern of the liquids /l/ and /r/ is characterized by two tongue gestures, one of the tongue tip and the other of the tongue dorsum, which makes these sounds more complex from an articulatory point of view. The degree of constriction of the tongue-tip gesture is greater in /r/ than in /l/, while the gesture of the tongue dorsum towards the pharynx is more evident in /l/ [5]. It can therefore be deduced from the ultrasound descriptions above that articulatory complexity varies according to the sound class and may be an important factor in image analysis.
Considering the difficulties inherent in the use of TUS in the clinical context, together with the different articulatory complexity of the sound classes, the hypotheses of the study are defined as follows:
(1) Judges experienced in TUS analysis would be more accurate in judging the images than non-experienced judges.
(2) TUS images of the coronal liquids (/l/ and /r/) would be more difficult to judge than images of the coronal fricatives.
The purpose of the present study is to verify whether the accuracy in the judgment of ultrasound images varies according to the experience of the judges and the sound class (alveolar liquids and coronal fricatives).

METHODS
The present study was approved by the Research Ethics Committee of the Faculdade de Filosofia e Ciências da Universidade Estadual Paulista (UNESP), Marília (Faculty of Philosophy and Sciences of São Paulo State University), under nº 1.268.673/2015. All individuals included in the research were informed about the study and signed the informed consent form (ICF).

Participants
Thirty judges, divided into two groups, participated in the present study: 15 judges who were experienced in ultrasonographic analysis of tongue movement (G1) and 15 judges who had no previous experience with this tool of speech analysis (G2).
The judges were recruited from the undergraduate and graduate courses in Speech-Language Pathology of the home institution. Inclusion in G1 (experienced judges) required previous knowledge of ultrasound analysis of tongue movement in speech production, such as a short course or technical training in this type of analysis. Because few clinicians had previous experience in TUS analysis, the length of experience with the technique was not controlled. For G2, the inclusion criterion was previous knowledge of the speech production process and of the phonetic classification and description of the phonemes of Brazilian Portuguese. All judges were required to have passed a course in Phonetics and Phonology of Brazilian Portuguese. The exclusion criterion was the lack of the knowledge necessary for the analyses to be performed. Figure 1 presents the characterization of the judges (G1 and G2).

Procedure
Stimulus
The stimuli were ultrasound images of logatomes containing the coronal liquids (/l/ in [a'la] and /r/ in [a'ɾa]) and the coronal fricatives (/s/ in [a'sa] and /ʃ/ in [a'ʃa]), in the intervocalic context of the vowel /a/, produced by 20 adults (10 men and 10 women) aged 20 to 30 years, monolingual speakers of Brazilian Portuguese with typical speech production. For each production, the frame corresponding to the maximum point of tongue constriction for the phoneme was selected as a static ultrasound image using Sound Forge Studio 6.0 software, for a total of 80 frames: 20 for /l/, 20 for /r/, 20 for /s/ and 20 for /ʃ/.
The data in this database were collected using a portable ultrasound system, model DP 6600, with a transducer coupled to a computer, a unidirectional microphone and a head stabilizer. The acoustic signal and the images were recorded simultaneously using AAA (Articulate Assistant Advanced) software combined with a synchronizer that aligns the images with the acoustic signal. The US images were acquired at a frequency of 6.5 MHz, with a 120° field of view and a sampling rate of 29.97 Hz.
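As an illustration of what the 29.97 Hz sampling rate implies when a single frame is selected, the sketch below converts between time and frame index. It assumes that frame 0 coincides with time zero of the synchronized recording; the function names are hypothetical and are not part of AAA or Sound Forge.

```python
FRAME_RATE_HZ = 29.97  # sampling rate reported for the US images


def frame_interval_ms(frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Time between two consecutive ultrasound frames, in milliseconds."""
    return 1000.0 / frame_rate_hz


def nearest_frame(time_s: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Index of the frame closest to a time point (assumes frame 0 at t = 0)."""
    if time_s < 0:
        raise ValueError("time must be non-negative")
    return round(time_s * frame_rate_hz)


def frame_time_s(index: int, frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Time stamp of a frame index, in seconds (assumes frame 0 at t = 0)."""
    return index / frame_rate_hz
```

At this rate consecutive frames are roughly 33 ms apart, so the selected frame can differ from the true instant of maximum constriction by up to about half that interval.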

Encoding of images
A previous analysis of the ultrasound images was made by two experienced examiners to verify not only their correspondence with the ultrasound pattern described as typical of the coronal liquids and fricatives, but also the need for adjustments to this description.
After the inspection of the images, and drawing on the existing descriptions in the literature [5,17], the categories for the assessment of the TUS images, described in Figure 2, were defined.
Once the categories were defined, the two experienced examiners independently judged all ultrasound images, obtaining accuracies of 100% and 87.5%, respectively. The Kappa coefficient was then calculated, yielding a degree of agreement of 1.00 (p < 0.05).
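For reference, agreement between two examiners is commonly quantified with Cohen's kappa. The sketch below is a minimal illustration with invented labels; the text does not state which software or kappa variant was used, so it should not be read as the authors' procedure.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for eight images (not data from the study).
examiner_1 = ["l", "r", "s", "sh", "l", "r", "s", "sh"]
examiner_2 = ["r", "r", "s", "sh", "l", "r", "s", "sh"]  # one disagreement
kappa_example = cohen_kappa(examiner_1, examiner_2)
```

Kappa equals 1.00 only when the two label sequences are identical; any disagreement lowers it below 1.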

Judgment of the images
The judgment task was designed so that the judges analyzed a pair of static ultrasound images from the same speaker rather than a single image, in order to minimize the effect of differences between individuals in the structures of the vocal tract and in the positioning of the transducer during image capture.
Therefore, the ultrasound images of the maximum point of tongue constriction in the production of the liquids /l/ and /r/, on the one hand, and of the fricatives /s/ and /ʃ/, on the other, were paired and organized in two different PowerPoint files for the judgment, as shown in Figures 3 and 4.
Before the judgment itself, a description of the expected ultrasound pattern for the coronal liquids and fricatives was provided in a PowerPoint file, which presented not only important information for the interpretation of the images (for example, where the different parts of the tongue are located), but also examples of the ultrasound pattern for each of the possibilities of analysis.
The judges could return to the description and initial example as many times as they thought necessary. In both judgments, that is, of the liquids and of the fricatives separately, the task did not exceed 10 minutes.
Finally, a statistical and inferential analysis of the data was made considering the accuracy of the judges in terms of percentage of correct responses. A Factorial ANOVA (2 factors) was used to verify whether the accuracy in the judgment of the images varied according to the experience of the judges (experienced vs. non-experienced) and the sound class (liquids vs. fricatives) (Table 2).
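The two-factor analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not name the software used, the accuracy values are invented, and the design is treated as fully between-groups with balanced cells. That treatment is only inferred from the reported error degrees of freedom, since 30 judges x 2 sound classes gives 60 scores in 4 cells and 60 - 4 = 56. The function name and dictionary keys are hypothetical.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA; `cells` maps (level_a, level_b) to a list of scores.

    Returns {effect: (F, df_effect, df_error)} for both main effects
    and their interaction.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    sizes = {len(scores) for scores in cells.values()}
    if (len(levels_a) < 2 or len(levels_b) < 2 or len(sizes) != 1
            or len(cells) != len(levels_a) * len(levels_b)):
        raise ValueError("a complete, balanced design with 2+ levels per factor is required")
    n = sizes.pop()  # scores per cell
    grand = mean(x for scores in cells.values() for x in scores)
    mean_a = {a: mean(x for b in levels_b for x in cells[(a, b)]) for a in levels_a}
    mean_b = {b: mean(x for a in levels_a for x in cells[(a, b)]) for b in levels_b}
    # Sums of squares for the main effects, the interaction and the error term.
    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = n * sum((mean(s) - grand) ** 2 for s in cells.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((x - mean(s)) ** 2 for s in cells.values() for x in s)
    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_error = len(cells) * (n - 1)
    if df_error == 0 or ss_error == 0:
        raise ValueError("within-cell variability is required to form F ratios")
    ms_error = ss_error / df_error
    return {
        "factor_a": (ss_a / df_a / ms_error, df_a, df_error),
        "factor_b": (ss_b / df_b / ms_error, df_b, df_error),
        "interaction": (ss_ab / (df_a * df_b) / ms_error, df_a * df_b, df_error),
    }


# Invented accuracy percentages, three judges per cell (not data from the study).
example_cells = {
    ("experienced", "fricatives"): [90, 95, 100],
    ("experienced", "liquids"): [80, 85, 90],
    ("non-experienced", "fricatives"): [85, 90, 95],
    ("non-experienced", "liquids"): [75, 80, 85],
}
example_result = two_way_anova(example_cells)
```

With 15 scores in each of the four cells, the error degrees of freedom returned by this sketch are 4 x 14 = 56. If the two scores of each judge were instead modelled as repeated measures, the error terms would differ.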

RESULTS
The accuracy in the judgment of the images of the liquids and fricatives as a function of the experience of the judges is shown in Table 1.
Table 2. Factorial ANOVA values regarding the main effects of the sound class and experience of the judges, as well as the interaction between sound class and experience of the judges
A significant main effect on the accuracy of the judgment was found only for the sound class (F(1,56) = 9.66, p < 0.01), with no significant effect of the experience of the judges (F(1,56) = 1.82, p > 0.17) or of the interaction between the experience of the judges and the sound class (F(1,56) = 0.002, p > 0.96). Although the experienced judges were numerically more accurate than the non-experienced ones, this difference was not statistically significant.
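As a cross-check of how F values with one numerator degree of freedom relate to p values, the sketch below uses the identity F(1, df) = t(df)^2 and integrates the Student t density numerically. It relies only on the standard library; the function names are hypothetical and the Simpson integration is an illustrative choice, not the procedure used in the study.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def p_value_f_one_df(f_value: float, df_error: float, steps: int = 2000) -> float:
    """Upper-tail p value for F(1, df_error), computed as P(|T| > sqrt(F))."""
    if f_value < 0 or df_error <= 0 or steps % 2:
        raise ValueError("need F >= 0, df_error > 0 and an even number of steps")
    t = math.sqrt(f_value)
    h = t / steps
    # Composite Simpson's rule for the area under the t density on [0, t].
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    half_central_area = total * h / 3
    return 1.0 - 2.0 * half_central_area


p_sound_class = p_value_f_one_df(9.66, 56)
p_experience = p_value_f_one_df(1.82, 56)
p_interaction = p_value_f_one_df(0.002, 56)
```

Applied to the F values reported above, this gives a p value below 0.01 for the sound class and p values above 0.17 and 0.96 for the experience of the judges and the interaction, respectively.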

Sound class
In contrast, the ultrasound images corresponding to the fricative class had a greater percentage of correct judgments than the images of the liquids. Figure 5 illustrates the results found.

DISCUSSION
The present study aimed to verify whether the accuracy in the judgment of ultrasound images varies as a function of the experience of the judges and the sound class (coronal liquids and fricatives).
Considering the difficulties inherent in the use of TUS in the clinical context, combined with the different articulatory complexity of the sound classes, it was expected that (1) judges experienced in TUS analysis would be more accurate in judging the images than non-experienced judges and (2) the TUS images of the coronal liquids (/l/ and /r/) would be more difficult to judge than the images of the coronal fricatives (/s/ and /ʃ/).
The first hypothesis, related to the experience of the judges, was not confirmed, since there was no statistically significant difference in the percentage of correct judgments between the two groups of judges (experienced and non-experienced in ultrasound analysis). In both groups the accuracy in the judgment was greater than 85%, suggesting an excellent performance by the judges. All judges (experienced and non-experienced) came from the same institution. Although they had no specific training in the visualization of US images, the non-experienced judges had a proven and quite detailed phonetic background regarding the production of speech sounds. In addition, they had received previous explanations regarding the US pattern. This training of the non-experienced judges may have been a differential that led to the results found. The widespread use of US must be carefully considered, especially in light of further research with judges of different levels.
It is interesting to note that the literature contains no explicit description of the need for previous training of researchers and/or clinicians to enable the interpretation of ultrasound images of the tongue. In the two classic studies of the area [1,2], the authors highlight TUS as an effective instrument in the analysis of speech production compared with other articulatory techniques (magnetic resonance imaging (MRI), X-rays, EMA, micro X-rays), for which, in addition to the high cost of the equipment, handling and portability are also limiting aspects for clinical use. In addition, TUS captures the contour of the tongue dynamically and in real time, allowing the study of movement from the dorsum to the tip of the tongue during the production of lingual consonants and vowels, which makes it a useful tool in the diagnostic process of speech disorders as well as in the therapeutic process (use of biofeedback in the therapeutic context).
However, it should be emphasized that both groups of judges in the present study had previous knowledge not only of the speech production process but also of the articulatory characteristics of Brazilian Portuguese phonemes, a fact that may explain their similar performance. Thus, what allows the correct interpretation by the judges is previous training in aspects of speech production rather than specific training in the technique of speech production analysis, in this case ultrasonography of tongue movement. It is assumed that training in Phonetics, as expected in the curricular guidelines of the Speech-Language Pathology and Audiology courses (National Council of Education, 2002), gave the judges not only familiarity in identifying the main active articulator of speech production, the tongue, in the sagittal plane in the TUS images, but also the ability to recognize the different contours of the tongue surface corresponding to the production of the coronal liquids and fricatives.
Regarding the judgment of the two sound classes, it was hypothesized that the liquids would be more difficult to judge than the fricatives. This hypothesis was corroborated, since there was a significant difference in accuracy as a function of the sound class: the liquids were judged less accurately than the fricatives.
One possible explanation for this difference in accuracy lies in the articulatory complexity of the liquids.
Previous studies [18,19] that used TUS images to describe the coronal liquids /l/ and /r/ characterized these sounds as the most complex, since they involve the production of two simultaneous articulatory gestures: one of the tongue tip and another of the tongue dorsum. The difference between the images of the liquids was detected by overlapping the curves of /l/ and /r/. This overlap showed that the gesture of the tongue dorsum towards the pharynx is more evident in /l/ than in /r/, while the tongue-tip gesture is greater in /r/ than in /l/ [5]. Another argument in favor of the complexity of the liquids is that, even though there is variability in the production of /s/, as described in a previous study [17], this variability did not lower the accuracy for the fricative class relative to the liquids.
In addition, because the tip of the tongue is not always clear in TUS images, the judges were asked to observe mainly the direction of the dorsal gesture when judging the liquids. This may have contributed to the lower accuracy for the liquids relative to the fricatives, since there is usually no articulatory description emphasizing the movement of the posterior part of the tongue. The phonetic description of /l/ and /r/ in BP shows that the difference in production between these sounds lies in the duration parameter and in the ballistic movement of the tongue tip necessary for the production of /r/ [20-22].
Despite the limitations of the present study, namely the lack of control of the length of experience of the first group of judges and of the response time during the judgment task, the feasibility of using TUS in clinical practice as a complementary tool for the analysis of speech production was confirmed.

CONCLUSION
The study showed that experienced and non-experienced judges did not differ in the accuracy of their judgment of ultrasound images. However, both groups had proven training in phonetics, received previous instructions for the judgment and came from the same institution.
The liquids were judged less accurately than the fricatives.
TUS can be carefully disseminated in the clinical practice of the speech-language therapist as a therapeutic and diagnostic support resource; however, clinicians should have theoretical knowledge of phoneme production and of the TUS pattern for each sound class.

Figure 2. Example of the ultrasound pattern of the target productions of the liquids /l/ and /r/ and fricatives /s/ and /ʃ/. Image (a) represents the production of /l/, and the arrows indicate the two simultaneous gestures of the tip and the dorsum of the tongue (with the dorsum gesture posteriorized, towards the pharynx); image (b) illustrates the production of /ɾ/, and the arrows show the two simultaneous gestures of the tip and the dorsum of the tongue (in this figure, the dorsum gesture is less posteriorized); image (c) refers to the production of /s/ and shows the anteriorized tip and blade of the tongue (the arrow shows the presence of a groove); image (d) refers to the production of /s/ and shows the anteriorized tip and blade of the tongue (the arrow shows the absence of a groove); finally, image (e) shows the production of /ʃ/, and the arrow identifies the raised tongue in a concave shape. Source: research data.

Figure 3.

Figure 4. Example of an image showing the maximum point of tongue constriction in the production of the fricatives /s/ and /ʃ/, as presented to the judges for the judgment task.

Figure 5. Accuracy of the judgment as a function of the experience of the judges and the sound class

Table 1. Mean and standard deviation of the percentage of correct judgments of the ultrasound images