Thyroarytenoid muscle and vocal fry: a literature review



Speech-language pathologists use vocal exercises such as vocal fry (VF), which originates from strong contractile activity of the thyroarytenoid (TA), an intrinsic laryngeal muscle. The aim of this study was to review the literature on the TA and VF. A literature search covering the last 20 years was performed in the LILACS, SciELO, PubMed, Web of Science and Google Scholar databases. The internal bundle of the TA was found to have slow-twitch, fatigue-resistant fibers with isotonic contraction; the external bundle has fast-twitch, fatigable fibers with isometric contraction. VF is characterized by the perception of glottal vibration pulses during emission at the lowest frequencies of the vocal range (low-pitched crackling, or vocal fry). It is produced mainly by the action of the TA, especially its internal portion, which shortens markedly, leaving the mucosa loose and bulky along the free edge, increasing subglottic pressure and the levels of jitter, shimmer and noise, and reducing airflow. Based on the literature, isometric exercise of the external TA would consist of VF sustained at the lowest frequency possible for the subject (maximum contraction) for six seconds, five to ten times daily, consistent with its predominance of fast-twitch fibers. For isotonic exercise of the internal TA, high-pitched sounds would be used to stretch the muscle, alternating emissions in VF (concentric contraction) with emissions in head modal register or falsetto (very high-pitched sounds) (eccentric contraction), in several daily series of eight to 12 repetitions, consistent with its predominance of slow-twitch fibers.

Voice; Laryngeal muscles; Phonation; Rehabilitation; Speech therapy


I. Department of Speech-Language Pathology and Audiology, Universidade Federal de Santa Maria - UFSM - Santa Maria (RS), Brazil

II. Undergraduate Program in Speech-Language Pathology and Audiology, Universidade FEEVALE - Novo Hamburgo (RS), Brazil

III. Undergraduate Program in Physical Therapy, Universidade Federal do Pampa - UNIPAMPA - Uruguaiana (RS), Brazil





The human voice results from complex interactions among muscular, connective, epithelial, cartilaginous, ligamentous, nervous and bony elements. It is produced intentionally or reflexively and has played both an artistic and a communicative role throughout human evolution(1). Within this complex voice-generating system, the thyroarytenoid (TA), an intrinsic laryngeal muscle also called the vocal muscle, stands out as the primary component of the oscillator that generates sound waves: the vocal folds(1-5).

Natural voice emission requires muscle activity, including that of the TA, which participates in glottal adduction and in the production of both low- and high-frequency sounds. Speech-language pathologists also use vocal exercises that target specific muscles in accordance with treatment goals, such as the vocal fry (VF) technique, which is produced by strong contraction of the TA(1,3-9). The speech-language pathologist therefore needs to know the neuroanatomical and physiological characteristics of the TA and its role in the VF technique used in the clinic, in order to support the criteria for its use and indication.

In general, the literature lacks research on the physiology and effectiveness of the exercises used for voice disorders. The physiological goal of most vocal exercises is still largely understood in intuitive terms(2), which motivates the present systematization of current knowledge on both the TA and VF. In otorhinolaryngological practice, the TA is important as one of the main structures responsible for voice production, supporting the mucosal oscillation that generates the voice signal. Knowledge of the neuroanatomical and physiological conditions of the TA and of the characteristics of spontaneous speech in vocal fry allows the otorhinolaryngologist to investigate traumatic brain injuries and neurological diseases that cause severe dysarthrophonia, whose first manifestations may appear at the laryngeal/phonatory level and require specific referrals(1,10).

In view of all this, the aim of this study was to conduct a review of the literature related to the TA and VF. To this end, we performed a literature review on the subject in the databases LILACS, SciELO, PubMed, Web of Science and Google Scholar, covering the last 20 years.


Skeletal muscle

Skeletal muscle is muscle whose functional unit is the muscle fiber (myocyte), a thin, long, multinucleated cell with a diameter between 10 and 100 micrometers. Each muscle fiber contains thousands of myofibrils, the contractile elements: cylindrical filaments 1 to 2 micrometers in diameter that fill the interior of the cell. Myofibrils are composed of protein myofilaments (myosin, actin, tropomyosin and troponin) arranged side by side to form the sarcomere, the unit responsible for muscle contraction(11). The ability to contract is characteristic of muscle tissue and results from the sliding of the myofilaments over one another(12).

Muscle fibers are grouped into fascicles which, when bundled within a layer of connective tissue, form a whole muscle. Skeletal muscle contractions are mostly voluntary, controlled by the nervous system and triggered by a stimulus that reaches the muscle through the synapse between a neuron (motoneuron) and the muscle fibers it innervates. This set is called a motor unit. Each axon innervates muscle fibers of the same type, and the number of axons varies with the size of the muscle innervated(13).

The speed and strength of the contraction are determined by the number of activated motor units(11,13). The control of muscle strength is essential to the accomplishment of any movement, including voice production.

A skeletal muscle, adapted to each motor act, usually contains all types of muscle fibers; although their distribution varies, one fiber type always predominates(13).

Skeletal muscle fibers differ in their resistance to fatigue and in the tension they develop, in direct accordance with their oxidative capacity to transform chemical energy into mechanical energy(11). Two types are recognized: slow-twitch (ST) fibers, also called red, type I or tonic; and fast-twitch (FT) fibers, also called white, type II or phasic(3-7).

ST fibers (type I) are adapted to continuous muscle contraction and resist fatigue. They do not tolerate overload but do sustain repetitive, low-demand activity, since their maximal contraction is lower; they are recruited even in anaerobic activities. These fibers contain many mitochondria, are bulky, and have high levels of myoglobin, the substance responsible for their red color(11).

FT fibers (type II) tolerate overload and show rapid, discontinuous contractions, but they sustain little repetition and are more susceptible to muscle fatigue. They can be classified as FT-A or IIa, FT-B or IIb, and FT-C or IIc, according to their motor function(11,14).

FT-A (IIa) fibers combine characteristics of FT-B fibers and of the more fatigue-resistant ST fibers, and are considered an intermediate type. They are larger in diameter than ST fibers, contract more strongly and more quickly, sustain contraction for a longer period, and have both aerobic and anaerobic capacity. FT-B (IIb) fibers, also called fast-glycolytic, are fast, have the greatest anaerobic potential and little resistance to fatigue, and are the true fast-twitch fibers. FT-C (IIc) fibers are rarer and may participate in reinnervation or in the transformation of motor units(13,14).

In voluntary muscle contraction, regardless of exercise intensity, ST fibers are activated first, and FT-A fibers are recruited when a fast and powerful energy supply is needed. FT-B fibers are recruited only at maximum or near-maximum levels of muscle activity(11,14).

In denervated canine larynges, FT fibers were found to atrophy to a greater extent and more rapidly than ST fibers, with a reduction in diameter, whereas ST fibers, even while atrophying, showed an increase in diameter. These changes generally occurred within three months after denervation(15).

Thyroarytenoid muscle

The thyroarytenoid muscle (TA) is a paired skeletal muscle that belongs to the intrinsic muscles of the larynx, with each side forming part of a vocal fold. It receives motor and sensory supply from the vagus nerve (cranial nerve X) through one of its branches, the inferior laryngeal nerve (ILN), and lies lateral to the elastic membrane and the thyroarytenoid ligaments(1,3-6,8,9,14,16-19).

The TA has two main muscle bundles. The thyromuscular or external bundle (external TA) inserts on the inner surface of the thyroid cartilage and extends to the muscular process of the arytenoid cartilage. The thyrovocal or vocalis bundle (internal TA), or vocal muscle, inserts anteriorly on the inner surface of the thyroid cartilage and posteriorly on the vocal process of the arytenoid cartilage(1,3-6,8,16-19). A third bundle of muscle fibers, called the ventricular or vestibular thyroarytenoid, is directed upwards toward the vestibular folds(1,3-6,8,16). In addition to these three bundles, the thyroepiglottic muscle depresses the epiglottis through the contraction of TA fibers that extend all the way to the aryepiglottic folds(1,8).

Regarding innervation, the inferior laryngeal nerve (ILN) arises from the vagus nerve, loops around the brachiocephalic trunk on the right and the aorta on the left, and ascends in the tracheoesophageal groove in close relationship with the inferior pole of the thyroid gland. It enters the larynx at the lower edge of the inferior pharyngeal constrictor muscle and reaches the posterior larynx. It divides into five motor branches that supply all the intrinsic muscles of the larynx except the cricothyroid (CT), which is innervated by the superior laryngeal nerve of the vagus, which also has an afferent branch responsible for the sensitivity of the subglottis(1,3-5,14,16,17).

The intrinsic muscles of the larynx, including the TA, perform precision movements, with tension and postural adjustments of the vocal folds(1,3-5,10,16,18,19), in cooperation with the extrinsic muscles(3-5,10,19-22). They are composed of both ST and FT fibers(8,11,15,23).

The fastest intrinsic muscles, such as the TA, the lateral cricoarytenoid and the interarytenoids, have a higher number of FT-A fibers and are therefore more fatigable, while the slower ones, such as the cricothyroid and the posterior cricoarytenoid, are rich in ST fibers and are more resistant to fatigue(8,11,15,23-25).

The external TA is rich in highly fatigable FT fibers(1,3-7,14,23). Its contraction is isometric, involving resistance without movement and tension without stretching(1,3-5,11,14). It is considered predominantly an adductor and is highly important for producing and maintaining glottal stability, with steady glottal closure in the face of the higher subglottic pressure required for higher vocal intensities(1,3-7,26).

In dogs, it has also been observed that ST fibers are almost absent in the external TA and that, after denervation, the TA atrophies quickly because of the predominance of FT fibers(15).

Contraction of the internal TA increases glottal strength and resistance, thereby regulating vocal intensity(3-5,19,25). It increases the mass and decreases the length of the vocal folds, reducing the distance between the thyroid and arytenoid cartilages and causing the mucosa that covers the vocal folds to accumulate(1,3-5,9).

The increase in vocal fold mass increases their inertia against the airstream, slowing the vibratory cycles and producing low-frequency sounds; the internal TA is thus considered responsible for the tension that produces low-pitched sounds(1,3-9,19).
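As a rough illustration of why added mass lowers pitch, the vocal fold can be idealized as a single mass on a spring. This lumped model is a textbook simplification introduced here for illustration, not a formula taken from the reviewed literature; the symbols m (effective vibrating mass) and k (effective stiffness) are assumptions of the sketch.

```latex
f_0 \approx \frac{1}{2\pi}\sqrt{\frac{k}{m}}
```

Under this idealization, increasing m while k stays constant lowers the fundamental frequency f_0, which matches the qualitative description of low-pitched emission given in the text.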

This muscle is rich in ST fibers, with isotonic contraction(23,25), which probably helps in the finer movements of the vocal folds during speech, such as prolonged adduction for extended communication purposes(7).

The intrinsic muscles of the larynx are considered relatively resistant to fatigue(22,24), and some authors(3-7) attribute this resistance to the internal TA, with its predominance of ST fibers, rather than to the external TA.

The cricothyroid muscle, responsible for the production of high-frequency (high-pitched) sounds, is considered more fatigue-resistant than the TA. For this reason, a higher-pitched vocal pattern is recommended during more demanding vocal activities as a way to minimize or avoid vocal fatigue(3-6).

However, the ability of a muscle to sustain contraction over a prolonged period depends on several factors: the distribution of fiber types, their organization, the angle of traction, muscle length, joint angle, speed of contraction, and motor unit activation(11,12,20).

In addition to the muscle characteristics, vocal fatigue may be influenced by an increased viscosity of the vocal folds, reduction of the blood flow and tension of the non-muscular tissues(2,6,7,14,22). Muscle fatigue may arise after prolonged periods of speech, as a negative adjustment that happens as a result of prolonged use of the voice(7,14,20,22), regardless of how well the muscles are developed.

From the above, it appears that the definition and characteristics of vocal fatigue are still uncertain and that there is no consensus on the subject(2,6,7,14,22).

Very little is known about the upper bundle of the human TA (the vestibular thyroarytenoid), because only half of the human population has it(1). It may contribute to constriction of the laryngeal vestibule, favoring medial approximation of the vestibular folds and adding pressure on the mucous glands of the laryngeal ventricle, which release lubricating secretions(3,4,6,7,19).

Basal register and vocal fry

There is still no consensus in the literature on the term "vocal register"(1,18,19,27). Some authors relate it to the different forms of vocal emission that cover the full range of the human voice, from the lowest to the highest pitch, with each register comprising frequencies of uniform emission quality. The variations between registers are related to changes in the cross section of the vocal folds produced by differential contractions of the intrinsic muscles. The three main registers of the human voice are vocal fry, modal, and high or falsetto(1,3,4,18,19,26,27).

The basal register (BR), or pulse register, is characterized by the perception of glottal vibration pulses during emission(1,19,27-30). This allows the emission to be compared to the sound of a "motor boat", the "creak of a door", "food frying" or the "act of dragging a stick along a fence"(1,27). It can be produced during both inhalation and exhalation(29).

During emission in BR, the vestibular folds medialize more markedly than in the modal register, the vocal folds are shortened, and glottal closure is stronger(1,3,4,6,7,9,18,19,21,30,31). This firm closure is probably due to the strong lateral compression of the vocal folds provided by the external portion of the TA(3,4,19). Even with firm glottal closure, the mucosa is loose and bulky along the free edge because of the shortening of the vocal folds(1,6,7,9,18,19,21,30,31). The subglottic air rises in the form of bubbles between the vocal folds, approximately at the junction of the anterior two thirds of the glottis(9,27,32,33).

The vocal processes of the arytenoids appear to be levered anteriorly and medially, which may further reduce the posterior glottic gap(18). The thickening of the vocal folds, together with the decreased stiffness of the vocal ligament during BR, may be the main factors that reduce the speed of vocal fold vibration by altering the vibratory cycle. In this cycle, the vocal folds open and close one to three times in rapid succession (depending on the author) and then remain closed for a longer period, in a pulsed vibration pattern that also characterizes this register(1,3,4,6,7,9,18,19,21,28,30-36).

In this register, the larynx is lowered in the neck(3-5,19,28,37), and the vestibular folds contract until they come into contact with the vocal folds, reducing or even eliminating the ventricular space. This produces the thick, compact laryngeal configuration of BR, with a smaller ventricular space than in the modal register(19).

Aerodynamic evaluation of emission in BR shows increased subglottic pressure and reduced airflow(1,18,19,26,29,32-35).

As for the acoustic characteristics, authors differ on the frequency range covered by BR, with reported values ranging from 20 to 90 Hz, but all agree that these are the lowest frequencies the human voice can produce(1,3,4,6,7,9,18,19,21,27,29,30,32-34,38). The standard deviation of the fundamental frequency during phonation was found to be greater in BR than in the modal register(34,35).
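The two acoustic descriptors above (mean fundamental frequency and its standard deviation) can be sketched in a few lines. This is a minimal illustration, assuming the glottal cycle durations have already been extracted by some other tool; the function name `f0_summary`, the constant `BASAL_RANGE_HZ` and the sample values are hypothetical, and the 20 to 90 Hz bounds are simply the widest range reported in the reviewed literature, not a diagnostic criterion.

```python
from statistics import mean, stdev

# Widest basal-register range reported in the reviewed literature (Hz).
# Used here only as an illustrative bound, not as a clinical cut-off.
BASAL_RANGE_HZ = (20.0, 90.0)


def f0_summary(periods_s):
    """Summarize fundamental frequency from glottal cycle durations (seconds)."""
    if len(periods_s) < 2:
        raise ValueError("at least two cycles are needed for a standard deviation")
    f0 = [1.0 / p for p in periods_s]  # cycle-by-cycle frequency in Hz
    m = mean(f0)
    return {
        "mean_hz": m,
        "sd_hz": stdev(f0),  # sample standard deviation of F0
        "in_basal_range": BASAL_RANGE_HZ[0] <= m <= BASAL_RANGE_HZ[1],
    }


if __name__ == "__main__":
    # Hypothetical cycle durations, invented for illustration only.
    basal_like = [0.021, 0.026, 0.019, 0.028, 0.022]
    modal_like = [0.0050, 0.0051, 0.0050, 0.0049, 0.0050]
    print(f0_summary(basal_like))
    print(f0_summary(modal_like))
```

The summary works on cycle durations rather than raw audio so that the sketch stays independent of any particular pitch-tracking method.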

The levels of frequency perturbation (jitter) during BR emissions are also significantly higher than those observed in the modal register, in both normal and pathological voices. The same holds for measures of amplitude perturbation (shimmer) and for the presence of noise(1,18,19,32-35).
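For readers less familiar with these measures, the sketch below shows one common way to compute them: the mean absolute difference between consecutive cycles, expressed as a percentage of the mean (often called "local" jitter and shimmer). The reviewed studies may have used other variants, so this definition is an assumption, and the function names and sample values are hypothetical.

```python
from statistics import mean


def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference, relative to the mean, in percent."""
    if len(values) < 2:
        raise ValueError("at least two cycles are needed")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def jitter_percent(periods_s):
    """Local jitter: perturbation of consecutive glottal cycle durations."""
    return local_perturbation_percent(periods_s)


def shimmer_percent(peak_amplitudes):
    """Local shimmer: perturbation of consecutive cycle peak amplitudes."""
    return local_perturbation_percent(peak_amplitudes)


if __name__ == "__main__":
    # Hypothetical values, invented for illustration only.
    irregular_periods = [0.021, 0.026, 0.019, 0.028, 0.022]
    regular_periods = [0.0050, 0.0051, 0.0050, 0.0049, 0.0050]
    print("irregular jitter (%):", jitter_percent(irregular_periods))
    print("regular jitter (%):", jitter_percent(regular_periods))
    print("shimmer (%):", shimmer_percent([0.80, 0.62, 0.85, 0.60, 0.78]))
```

Both measures share one helper because they differ only in the quantity being compared from cycle to cycle: duration for jitter, amplitude for shimmer.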

In perceptual evaluation, there are clear differences between emissions in BR and normal or even dysphonic voices. The voice produced in BR shows tension, little modulation and low intensity(1,19,29), and can also be called crackling voice(1,3,4,10,19,27).

The crackling voice may also be called vocal fry or glottal fry in American English(1,27-29), or creaky voice in British English(1,27-29), terms that are often confused with emission in BR. They are not necessarily equivalent, however: BR is characterized by a very low fundamental frequency with distinctive crackling(28,29,34,35), whereas crackling can occur at any frequency of the vocal range, not only in BR(1,29). Thus, the term "vocal fry" refers exclusively to crackling in the BR, while the term "creaky voice" indicates crackling introduced into any type of emission(1).

Two main features of BR are common to men and women: the occurrence of multiple opening and closing phases in each vibratory cycle (glottal pulses)(1,3,4,6,7,9,18,19,21,27,28,30-36) and the same frequency range(1,3,4,6,7,9,18,19,21,29,30,32-35,38). In addition, electroglottography showed that both men and women had a significantly higher vocal fold contact ratio in BR than in the modal register(30).

However, women had a significantly higher vocal fold contact ratio during BR than men, indicating greater asymmetry between the durations of the open and closed phases of the glottal cycle in women and confirming that gender differences exist not only in the modal register but also in BR(30).
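The contact ratio discussed above can be illustrated with a simple threshold-based estimate on one electroglottographic (EGG) cycle. This is only a sketch under stated assumptions: larger sample values are taken to mean greater vocal fold contact, the 50% criterion level is an arbitrary choice, and the function name and sample cycle are hypothetical; the cited study may have used a different method.

```python
def contact_quotient(egg_cycle, criterion=0.5):
    """Fraction of one glottal cycle in which the EGG signal exceeds a threshold.

    The threshold sits at `criterion` times the peak-to-peak amplitude above
    the cycle minimum. Larger sample values are assumed to mean more contact.
    """
    if not 0.0 < criterion < 1.0:
        raise ValueError("criterion must lie strictly between 0 and 1")
    lo, hi = min(egg_cycle), max(egg_cycle)
    if hi == lo:
        raise ValueError("flat signal: no contact phase can be identified")
    threshold = lo + criterion * (hi - lo)
    contacted = sum(1 for sample in egg_cycle if sample > threshold)
    return contacted / len(egg_cycle)


if __name__ == "__main__":
    # Hypothetical, coarsely sampled cycle, invented for illustration only.
    cycle = [0.0, 0.2, 0.9, 1.0, 0.95, 0.9, 0.8, 0.3, 0.1, 0.0]
    print("contact quotient:", contact_quotient(cycle))
```

A higher value means that the folds stay in contact for a larger share of the cycle, which is the sense in which the text compares registers and genders.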

Continuous emission in BR (also known as pulsed emission, given the features described above) in speech, syllables or vowels, used in speech therapy as a therapeutic technique, constitutes the traditional vocal fry(1,3-5,9,16,18,19,21,27-30).

In the literature surveyed, there is only one national (Brazilian) proposal, published in a monograph and in two book chapters(3,4,19), which describes two ways of performing vocal fry (VF) that differ in their physiological characteristics: tense and relaxed.

The relaxed VF would be characterized by a lower position of the larynx, predominant action of the TA (especially its internal bundle), minor participation of the vestibular folds, looser vocal fold edges, a more relaxed laryngeal framework, and single glottal pulses of weak intensity. The tense VF, on the other hand, would be characterized by elevation of the larynx with a consequent increase in glottal adduction, action of the lateral cricoarytenoid (LCA) and external TA muscles, greater participation of the vestibular folds, rigid vocal fold edges, a tense laryngeal framework, double or even triple glottal pulses, and weak intensity or intensity slightly higher than in the relaxed VF; it may also be called scarce-pulse fry(3,4,19).

As for the muscle activity used to produce VF, the literature refers to a predominance of the LCA and TA muscles(3,4,19), although this would mainly concern the tense VF(3). Some authors consider the nomenclature that distinguishes the two types of VF, tense and relaxed, to be adequate and didactic(7,19,31). However, field research documenting the physiology and acoustic characteristics of these two types of VF is still needed, since none was found in the literature.

Most authors agree that VF is produced predominantly by the action of the TA, which forms the body of the vocal folds and is considered the main tensor responsible for the low-frequency sounds of the human voice(1,3-9,18,19,29,34,35). However, they do not mention any physiological differences in the way VF is performed.

Jitter is related to vocal roughness(1), and emissions in BR exhibit increased jitter(1,18,34,35). If this is related to the theoretical conception of the tense VF, it would imply that vocal roughness is generated by rigidity of the muscular system(3,4,19), but no research investigating this relation could be found.

However, some authors claim that producing a tense VF is incorrect(1,29). They state that VF must be produced as a prolonged, effortless emission in BR, after almost complete exhalation, or during inhalation for lower frequencies(1,29). This characterizes vocal fry as described above, rather than a creaky voice that can occur in any frequency range, since VF emitted with excessive muscular tension has a higher fundamental frequency(1,29).

VF can be observed at the end of sentences in speakers without voice disorders(1,27). It also appears in the speech of some individuals as an expression of mood, such as the falling inflection of sadness or boredom, as a sign of fatigue, or as a result of incorrect use of the laryngeal system(1,27,28,38), and as a speech resource in broadcasting to build the vocal stereotype of "cheap seduction"(1).

It is a voice quality that often appears in relaxed voices, indicating low pressure, or in strained voices, showing expressions of surprise, admiration or suffering(27).

However, many authors state that the persistent use of vocal fry in speech is usually considered harmful and should be discontinued, as it represents hyperfunctional vocal behavior that can lead to a voice disorder(1,21,28). In addition, daily communication requires greater volume and projection, which is impractical in this type of emission(1,3,4).

Individuals who use VF in their day-to-day speech tend to build up considerable vocal tension in the attempt to raise the intensity of their voices, which is usually reduced in this type of emission(1,3,4). To eliminate the persistent use of VF in habitual speech, a direct change in the patient's vocal frequency is suggested(1,3,4).

The constant use of VF in speech may present itself as a clinical type of psychogenic dysphonia, defined as dysphonia by fixation in BR. Although not common, this type of voice disorder can be observed in male young adults in need of self-affirmation or emphasized aggressive features(1).

Patients with laryngeal contact granuloma who showed vocally abusive behavior were found to have a lower voice pitch, a monotonous voice and excessive use of VF during normal speech(39).

In addition, the voice crackling may represent a form of long-term vocal instability evident in certain neurological disorders(10).

Vocal fry as a speech-language pathology technique

Although vocal fry is studied as an emission in the basal register, it has been used for therapeutic purposes(40) for more than 20 years and is indicated in the treatment of several voice disorders(19,29,31), such as: psychogenic disorders (conversion falsetto, incomplete voice change, mutational falsetto, prolonged voice change, delayed voice change); muscle tension dysphonia (laryngeal isometry, vocal fatigue, uncomfortable phonation, mid-posterior triangular gap); organic dysphonia (nodules, polyps, cysts, edema and thickening of the vocal folds); hypernasality; cleft palate; velopharyngeal insufficiency or incompetence; spasmodic dysphonia; ventricular phonation; vertical partial laryngectomy; palatomia; presbylarynx; laryngeal paralysis; cases with high vocal pitch and tension in the vocal tract; monitoring of laryngeal balance, when the individual can easily enter VF; and even cases of dysphagia(1,3-6,9,16,19,21,29-33,37).

Although there is still no published scientific evidence on the physiological and acoustic characteristics of the two proposed ways of performing VF, the authors of this theoretical conception suggest the relaxed VF in cases of hyperkinetic dysphonia with difficulty in controlling airflow, conversion aphonia or dysphonia, and functional delays of voice change. They indicate the tense VF in cases of severe glottal insufficiency due to unilateral laryngeal paralysis, sulcus vocalis, and abductor spasmodic dysphonia, aiming for better glottic closure and vocal stability(3,4,16,19).

On the other hand, other authors state that VF practiced in a tense, and therefore higher-pitched, way must be corrected immediately, and that persisting with vocal fry is not indicated in cases of hyperkinetic dysphonia, in order to avoid increasing tension(1,29). To facilitate its production, low-pitched sounds, yawning and neck exercises could be used(29).

In addition, it is suggested that VF be used in the early treatment of vocal fold nodules only as a facilitating sound, because it provides limited excursion of the mucosa and actively demands contraction of the TA, which is highly fatigable(3,4,19).

According to the literature surveyed, vocal fry causes: strong contraction of the TA, shortening it; relaxation of the cricothyroid and the posterior cricoarytenoid; greater glottal closure; anterior-posterior constriction of the laryngeal vestibule; constriction of the nasopharynx; soft palate elevation; appearance of Passavant's ridge; mobilization and relaxation of the mucosa, increasing its mass; improved glottal closure; reduced phonatory tension, with more comfortable phonation after exercise; an increased oral component of resonance; decreased hypernasality; improved loudness; improved hoarseness; better direction of the airflow toward the oral cavity; an increased vocal fold contact quotient; decreased fundamental frequency; an improved noise-to-harmonics ratio; a higher number of harmonics; greater uniformity and distribution of frequencies in the spectrogram; increased energy of the vocal spectrum; a more regular audio signal; and increased maximum phonation time(1,3,9,19,21,26,28,29,32-34,37,38).

A recent study found that, in cases of glottal gaps, despite the improvement in glottic closure with the use of VF, measures of noise and jitter may increase and vocal quality may worsen(21).

It is stated that the practice of exercises with lower-frequency sounds, which occurs with the VF, stimulates the production of mucous secretion from the glands of Morgagni's ventricle, allowing increased lubrication of the epithelium of the vocal folds(3,4,19).

We emphasize the importance of performing and prescribing VF correctly, in strict accordance with each pathology, because an inappropriate duration or manner of execution may lower the fundamental frequency that the patient normally uses in spontaneous conversation and overload the TA, leading to fatigue(3,4,7,19).

It is also stated that performing exercises at the VF frequency, or in a very low frequency range, requires intense effort from the TA. This may not be the ideal condition for the muscle if it has not been stretched beforehand, which can be achieved by producing a soft falsetto(4,19).

When exercise is performed with excessive effort, speed, or load, lactic acid (a toxin with a muscle-potentiating function) accumulates in the muscles, limiting their performance(20).

Based on the theory presented here, it is suggested that the VF not be used as the last exercise in the vocal warm-up sequence of voice professionals, or as the last exercise of a session for patients in whom lowering the fundamental frequency is not the goal, because the VF sets a low fundamental frequency that may carry over into modal-register speech and cause fatigue(3,4,7,19). Nor is it recommended as the first exercise in these situations, because the TA needs to be stretched before the VF is practiced(4).

Stretching, when practiced regularly, increases muscle strength and elasticity. The motor responses to stretching balance muscle tension; improve muscle elasticity and length; organize posture; and enhance the development of motor gestures and proprioception(20).

As for the execution time of the VF, a positive effect on vocal quality was observed in adult subjects and in patients with a surgically repaired post-foramen cleft after three minutes of the VF technique(33).

In a study of patients who had undergone vertical partial laryngectomy, the VF performed for two minutes allowed greater vibration and approximation of the remaining supraglottic structures, with perceptual vocal improvement(31).

However, other authors found a negative effect after three minutes of VF in adults without alterations of the phonatory apparatus, probably due to muscle overload, and obtained better results with one minute of practice(38).

Adult women with an hourglass-shaped glottal chink(21) and adult women with a normal larynx(32) performed the VF in three sets of 15 repetitions, each sustained for the maximum phonation time, with 30 seconds of complete silence (passive rest) between sets(14), and showed positive laryngeal and vocal results.

Still, considering that the VF is predominantly produced by the action of the TA muscles, isometric exercise of the external TA(1,3-5,11,14) through the VF can be accomplished by holding the maximal contraction for six seconds, repeated five to ten times daily(11,14). Thus, for isometric contraction, which mainly exercises the external TA, the VF should be sustained at the lowest frequency possible for the subject (maximal muscle contraction) and held as described above, which would be compatible with the predominance of fast twitch fibers in this portion of the TA(1,3-7,14,18,19,20,23).

Isotonic contraction, compatible with the predominance of slow twitch fibers in the internal TA(3-5,18,24), involves resistance to movement, with muscle tension occurring while the muscle shortens (concentric contraction) and lengthens (eccentric contraction)(1,3-5,9); eight to 12 repetitions of the movement, in several series per day, are suggested(11,14). In this case, stretching the TA would require high-pitched sounds(4). In other words, for isotonic exercise of the internal TA, VF emissions should alternate with emissions in the head modal register or in the falsetto register (high-pitched sounds)(4,19), using the maximum number of repetitions stated in the literature, which is consistent with the prevalence of fatigue-resistant ST fibers(3-5,12,18,24). In any case, increasing the load, speed, or frequency of exercise results in greater muscle resistance, called endurance, through conditioning of the muscles involved(12,20).
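The two dosage schemes described above can be written out as simple arithmetic. The following is only an illustrative sketch, not a clinical tool: the names `IsometricPlan` and `isotonic_sequence` are hypothetical, and counting one "repetition" as one VF emission followed by one high-pitched emission is an assumption, since the sources cited do not define it in that detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsometricPlan:
    """Isometric VF dosage: sustained holds at the lowest possible pitch."""
    hold_seconds: float      # duration of each sustained VF emission
    daily_repetitions: int   # number of holds per day

    def daily_hold_seconds(self) -> float:
        # Total time per day spent in sustained VF.
        return self.hold_seconds * self.daily_repetitions


def isotonic_sequence(repetitions: int) -> list[str]:
    """One series alternating VF (concentric) and high-pitched sound (eccentric).

    Assumption: one repetition = one VF emission plus one high-pitched emission.
    """
    if not 8 <= repetitions <= 12:
        raise ValueError("the review cites 8 to 12 repetitions per series")
    sequence = []
    for _ in range(repetitions):
        sequence.append("VF")    # shortening of the internal TA
        sequence.append("high")  # stretching with head register or falsetto
    return sequence


lower = IsometricPlan(hold_seconds=6, daily_repetitions=5)
upper = IsometricPlan(hold_seconds=6, daily_repetitions=10)
print(lower.daily_hold_seconds(), upper.daily_hold_seconds())  # 30 60
```

Under these assumptions, the isometric scheme amounts to between 30 and 60 seconds of sustained VF per day, a small total load that fits the description of the external TA as highly fatigable.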


According to most authors(1,3-11,14-19,23,25,26,29,34,35), the TA forms the body of the vocal folds and is functionally divided into two beams, an internal one and an external one. The external TA is rich in FT fibers, is highly fatigable(1,3-7,14,23), and contracts isometrically(1,3-5,11,14); it is predominantly an adductor, related to the stability of glottal closure at higher levels of vocal intensity(1,3-7,26).

Contraction of the internal TA increases glottal strength and resistance, regulating vocal intensity(3-5,19,25); increases the mass of the vocal folds, decreasing their length and causing the mucosa that coats them to accumulate(1,3-5,9); and produces low-frequency sounds, since the internal TA is considered the tensor for low-pitched sounds(1,3-9,19). This muscle is rich in ST fibers of the isotonic contraction type and is more resistant than the external TA(23,25).

The literature characterizes the basal register by the perception of the glottal vibration pulses(1,19,27-30) during emission of the lowest frequencies of the entire vocal range (crackling in bass, also called vocal fry)(1,27-29), due to a very evident shortening of the TA and a firmer glottal closure, with the mucosa loose and bulky along the free edge and the ventricular space reduced by the approximation of the ventricular folds(1,3,4,6,7,9,18,19,21,30,31). Thus, the subglottic pressure is increased and the airflow is reduced(1,18,19,26,29,32-35). Emissions in this register show high levels of jitter, shimmer and noise when compared to the modal register commonly used in spontaneous speech(1,18,19,32-35). However, there is no consensus regarding the definition of "vocal register"(1,3,4,18,19,26,27), the exact range of frequencies produced in BR(1,3,4,6,7,9,18,19,21,27,29,30,32-35,38), or the number of glottic pulses that occur per vibration cycle in this register(1,3,4,6,7,9,18,19,21,27,28,30-36).

The VF, a speech-language pathology technique consisting of continuous emissions in the basal register during speech, syllables or vowels(1,3-5,9,16,18,19,21,27-30), is mainly produced by the action of the TA muscle(1,3-9,18,19,29,34,35). Its therapeutic indications range from control at the respiratory level, through glottal closure and mobilization of the mucosa, to closure of the velopharyngeal sphincter, with significant improvement in voice quality, laryngeal conditions and vocal nodules(1,3-6,9,16,19,21,26,28-34,37,38).

Despite the several indications for its use and the effects reported, there is no consensus on the execution time of the VF(11,14,21,31-33,38).

However, considering that the VF is produced by the predominant action of the TA muscles(1,3-9,18,19,29,34,35), and given the anatomophysiological characteristics of the muscle described in the literature(1,3-11,14-19,23,25,26,29,34,35), it is believed that, for isometric exercise of the external TA, the VF should be sustained at the lowest frequency possible for the subject (maximal muscle contraction) for six seconds, five to ten times a day, which is compatible with the characteristics of FT fibers(1,3-7,11,14,18,19,20,23).

Still, as stated in the literature, it is believed that isotonic exercise of the internal TA would require high-pitched sounds to stretch it; that is, emissions in VF (concentric contraction) should alternate with emissions in the head modal register or in falsetto (high-pitched sounds; eccentric contraction), in several daily series of eight to 12 repetitions, which is compatible with the characteristics of ST fibers(1,3-5,9,11,12,14,18,19,24).

Given the limited number of studies on the form of VF practice; the anatomophysiological and acoustic characteristics of VF emission; the execution time of the technique; the characteristics of the populations studied; the anatomophysiological characteristics of the TA muscles; and the relationships among these variables, further studies in this area are essential to support the clinical application of the VF with increasingly strong evidence.


The internal beam of the TA has isotonic, fatigue-resistant, slow twitch fibers, while the external beam has fatigable, isometric, fast twitch fibers. The VF is characterized by the perception of the glottal vibration pulses during emission of the lowest frequencies in the vocal range (crackling in bass or vocal fry), produced mainly by the action of the TA, especially its internal portion, which shortens markedly, leaving the mucosa loose and bulky along the free edge, increasing subglottic pressure and the levels of jitter, shimmer and noise, and reducing the airflow.

Isometric exercise of the external TA would take place with the VF sustained at the lowest frequency possible for the subject (maximal contraction) for six seconds, in five to ten daily repetitions. In isotonic exercise of the internal TA, high-pitched sounds would be used to stretch it, alternating emissions in VF (concentric contraction) with emissions in the head modal register or in falsetto (high-pitched sounds; eccentric contraction), in several daily series of eight to 12 repetitions.


  • 1. Behlau MS. Voz: o livro do especialista. v. 1. Rio de Janeiro: Revinter; 2005.
  • 2. Elliot N, Sundberg J, Gramming P. Physiological aspects of a vocal exercise. J Voice. 1997;11(2):171-7.
  • 3. Pinho SM. Tópicos em voz. Rio de Janeiro: Guanabara Koogan; 2001.
  • 4. Pinho SM. Fundamentos em fonoaudiologia: tratando os distúrbios da voz. Rio de Janeiro: Guanabara Koogan; 2003.
  • 5. Pinho SM, Tsuji DH, Bohadana SC. Fundamentos em laringologia e voz. Rio de Janeiro: Revinter; 2006.
  • 6. Denk DM, Swoboda H, Steiner E. [Physiology of the larynx]. Radiologe. 1998;38(2):63-70. German.
  • 7. Camargo Z. Da fonação à articulação: princípios fisiológicos e acústicos. Fonoaudiol Bras. 1999;2(2):14-9.
  • 8. Monfared A, Gorti G, Kim D. Microsurgical anatomy of the laryngeal nerves as related to thyroid surgery. Laryngoscope. 2002;112(2):386-92.
  • 9. Carrara-Angelis E, Behlau MS, Pontes PA, Tosi O. Comparative analysis of laryngeal configuration, perceptual auditive and spectrographic acoustic of vocal quality before and after emission in vocal fry. Folia Phoniatr Logop. 1992;44(1):1-16.
  • 10. Gazi FR, Felix GB, Brasolotto AG. Características vocais de indivíduos pós-traumatismo crânio-encefálico. Distúrb Comun. 2004;16(3):323-31.
  • 11. Ferrao ML, Fernandes Filho J, Fortes MS, Viana MV, Dantas EE. Efeito da predominância de tipo de fibra muscular sobre o emagrecimento e condicionamento aeróbico. Fit Perform J. 2004;3(4):231-5.
  • 12. Carter SL, Rennie CD, Hamilton SJ, Tarnopolsky A. Changes in skeletal muscle in males and females following endurance training. Can J Physiol Pharmacol. 2001;79(5):386-92.
  • 13. Barroso R, Tricoli V, Ugrinowitsch C. Adaptações neurais e morfológicas ao treinamento de força com ações excêntricas. Rev Bras Ciênc Mov. 2005;13(2):111-22.
  • 14. Saxon K, Schneider CM. Vocal exercise physiology. San Diego: Singular Publishing; 1995.
  • 15. Shindo ML, Herzon GD, Hanson DG, Cain DJ, Sahgal V. Effects of denervation on laryngeal muscles: A canine model. Laryngoscope. 1992;102(6):663-9.
  • 16. Mangilli LD, Amoroso MR, Nishimoto IN, Barros AP, Carrara-de-Angelis E. Voz, deglutição e qualidade de vida de pacientes com alteração de mobilidade de prega vocal unilateral pré e pós-fonoterapia. Rev Soc Bras Fonoaudiol. 2008;13(2):103-12.
  • 17. Sennes LU, Tsuji D, Bohadana S, Bento RF, Ribas GC. O uso de imagens tridimensionais no ensino da anatomia da laringe. Arq Int Otorrinolaringol. 2000;4(3):92-100.
  • 18. Blomgren M, Chen Y, Ng ML, Gilbert HR. Acoustic, aerodynamic, physiologic, and perceptual properties of modal and vocal fry registers. J Acoust Soc Am. 1998;103(5 Pt 1):2649-58.
  • 19. Cronemberger FF. Considerações teóricas sobre o vocal fry [monografia]. Salvador: CEFAC; 1999.
  • 20. Mello EL, Andrada e Silva MA. O corpo do cantor: alongar, relaxar ou aquecer? Rev CEFAC. 2008;10(4):548-56.
  • 21. Bolzan GP, Cielo CA, Brum DM. Efeitos do som basal em fendas glóticas. Rev CEFAC. 2008;10(2):218-25.
  • 22. Lovetri J, Lesh S, Woo P. Preliminary study on the ability of trained singers to control the intrinsic and extrinsic laryngeal musculature. J Voice. 1999;13(2):219-26.
  • 23. Hoh JF. Laryngeal muscle fibre types. Acta Physiol Scand. 2005;183(2):133-49.
  • 24. Dailey SH, Kobler JB, Zeitels SM. A laryngeal dissection station: educational paradigms in phonosurgery. Laryngoscope. 2004;114(5):878-82.
  • 25. Guida HL, Zorzetto NL. Morphometric and histochemical study of the human vocal muscle. Ann Otol Rhinol Laryngol. 2000;109(1):67-71.
  • 26. Björkner E, Sundberg J, Cleveland T, Stone E. Voice source differences between registers in female musical theater singers. J Voice. 2006;20(2):187-97.
  • 27. Ishi CT, Sakakibara KI, Ishiguro H, Hagita N. A method for automatic detection of vocal fry. IEEE Trans Audio Speech Lang Processing. 2008;16(1):47-56.
  • 28. Machado LP. Análise comparativa da constrição da parede nasal da faringe em registro modal e basal [monografia]. São Paulo: Universidade Federal de São Paulo; 1996.
  • 29. Behlau M, Madazio G, Feijó D, Azevedo R, Gielow I, Rehder MI. Aperfeiçoamento vocal e tratamento fonoaudiológico das disfonias. In: Behlau M. Voz: o livro do especialista. v. 2. Rio de Janeiro: Revinter; 2005. p. 409-564.
  • 30. Chen Y, Robb MP, Gilbert HR. Electroglottographic evaluation of gender and vowel effects during modal and vocal fry phonation. J Speech Lang Hear Res. 2002;45(5):821-9.
  • 31. Serrano DM, Suehara AB, Fouquet ML, Gonçalves AJ. Uso do som crepitante grave, modelo vocal fry nas laringectomias parciais verticais. Distúrb Comun. 2005;17(1):19-25.
  • 32. Brum DM. Modificações vocais e laríngeas ocasionadas pelo som basal [dissertação]. Santa Maria: Universidade Federal de Santa Maria; 2006.
  • 33. Elias VS. Eficácia do som basal no fechamento do esfíncter velofaríngeo [dissertação]. Santa Maria: Universidade Federal de Santa Maria; 2005.
  • 34. Conterno G, Cielo CA, Elias VS. Características vocais acústicas do som basal em homens com fissura pós-forame reparada. Rev CEFAC. 2011;13(1):171-81.
  • 35. Slifka J. Some physiological correlates to regular and irregular phonation at the end of an utterance. J Voice. 2006;20(2):171-86.
  • 36. Conterno G, Cielo CA, Elias VS. Fissura palatina reparada: fechamento velofaríngeo antes e durante o som basal. Braz J Otorhinolaryngol. 2010;76(2):185-92.
  • 37. Childers DG, Ahn C. Modeling the glottal volume-velocity waveform for three voice types. J Acoust Soc Am. 1995;97(1):505-19.
  • 38. Sarkovas C, Behlau MS. Avaliação perceptivo-auditiva e eletroglotográfica de efeitos dos exercícios: som basal e sopro som agudo, em fonoaudiólogas. In: 13º Congresso Brasileiro de Fonoaudiologia; 2005 Set; Santos, SP.
  • 39. Ylitalo R, Hammarberg B. Voice characteristics, effects of voice therapy, and long-term follow-up of contact granuloma patients. J Voice. 2000;14(4):557-66.
  • 40. Boone DR, McFarlane SC. The voice and voice therapy. 4th ed. Englewood Cliffs: Prentice Hall; 1988.

  • Thyroarytenoid muscle and vocal fry: a literature review
    Carla Aparecida Cielo; Vanessa Santos Elias; Débora Meurer Brum; Fernanda Vargas Ferreira

Publication Dates

  • Publication in this collection
    28 Oct 2011
  • Date of issue
    Sept 2011


  • Accepted
    14 Dec 2010
  • Received
    09 Jan 2010
Sociedade Brasileira de Fonoaudiologia Al. Jaú, 684 - 7º andar, 01420-001 São Paulo/SP Brasil, Tel.: (55 11) 3873-4211 - São Paulo - SP - Brazil