SciELO - Scientific Electronic Library Online


Ilha do Desterro

Print version ISSN 0101-4846; On-line version ISSN 2175-8026

Ilha Desterro vol.71 no.3 Florianópolis Sept./Dec. 2018 


Improving L2 Pronunciation Inside and Outside the Classroom: Perception, Production and Autonomous Learning of L2 Vowels

¹ Universitat Internacional de Catalunya, Barcelona, Catalunya, España

² Universidade Federal de Santa Catarina, Florianópolis, Santa Catarina, Brasil


Spanish/Catalan learners of English as a Foreign Language (EFL) attended a formal instruction (FI) period combined with explicit pronunciation instruction, which consisted of theoretical and practical approaches to English segmental phonetics as well as a contrastive analysis between the participants’ first languages and the target language, English. The EFL learners’ ability to perceive and produce L2 vowels was assessed before and after the 8-week instructional treatment. Results show that the EFL learners significantly improved their perception of vowel sounds embedded in real and non-words. However, no improvement in production was found as a result of the instruction received. While these results suggest that learners’ perceptual skills can be improved with teacher-led instruction, the quantity and availability of explicit pronunciation instruction were not sufficient to modify learners’ speech production. Thus, optimal results require learners to continue learning outside the classroom context. With this aim, this paper presents two autonomous activities developed to increase learners’ awareness of phonology: an L1-L2 Pronunciation Comparison Task and a Phonological Self-awareness Questionnaire.

Keywords: L2 pronunciation; formal instruction; autonomous learning


Pronunciation is a crucial component of the learning of oral skills in a second language (L2)¹ and, according to previous research, oral skills and pronunciation rely on exposure to good-quality input in order to be successfully enhanced (Celce-Murcia et al., 1996; Flege, 1991; Long & Larsen-Freeman, 1991). This can be problematic in the foreign language (FL) setting, where authentic input (i.e. native/native-like) tends to be scarce (Celce-Murcia et al., 1996; Muñoz, 2008; Saito, 2015) and where learners have limited opportunities for interaction with the target language (TL). According to Larson-Hall (2008), FL instruction comprises minimal input conditions and is usually delivered in no more than four hours of instruction per week. Therefore, the quantity of language input received by learners acquiring a language in these contexts is usually restricted to the teachers’ instruction, which might partly be delivered in the learners’ native language (Muñoz, 2008). Muñoz (2011) and Derwing and Munro (2015) highlight the need to provide learners not simply with additional exposure but also with greater quality of exposure inside and outside the classroom. Authors (2015) found that Portuguese EFL learners outperformed Catalan learners of English when identifying and discriminating English voiceless stops, despite the comparable number of years of formal instruction between the two groups and the similarity of the L1s. The authors speculated that these outcomes might be attributed to the greater quantity, and possibly quality (i.e. authenticity), of input received by the Portuguese learners outside the classroom. This might be connected to the fact that Portuguese learners are exposed to native English input through TV programs and films on a regular basis rather than being generally exposed to foreign shows and films dubbed in the local language, as is the case in Spain. However, further research with more controlled measures would have to be carried out in order to confirm this explanation.

Consequently, previous studies investigating the effect of formal instruction (FI)² on oral skills and/or pronunciation in the EFL context have failed to provide evidence that the input found in the EFL setting can alter these domains (Fullana & Mora, 2009; García-Lecumberri & Gallardo del Puerto, 2003; Monje-Sangüesa, 2016). One noteworthy reason for this lack of success lies in the fact that English pronunciation has been described as one of the most challenging skills to be learned (Calvo Benzies, 2014; Moyer, 1999; Scovel, 1988). In fact, Moyer (1999) explains that whilst learners may achieve near native-like abilities in other aspects of a foreign language (i.e., grammar and/or lexis), they rarely achieve native-like abilities in pronunciation, even after years of experience with the target language. As a consequence, most adult EFL learners are likely to speak with a foreign accent (Scovel, 1988).

Positive results of pronunciation enhancement as a result of FI have previously been reported in one of the following four scenarios: i) when the amount of FI was considerably large (Saito, 2015), ii) when the FI was accompanied by corrective feedback (Saito & Lyster, 2011), iii) when the FI was accompanied by explicit pronunciation instruction (Gordon & Darcy, 2016; Kissling, 2013; Thomson & Derwing, 2014) and iv) when learners’ awareness of the cross-linguistic features between the L1 and the L2 had been raised (Alves & Magro, 2011; Kennedy & Trofimovich, 2010; Silveira & Alves, 2009; Wrembel, 2005). This last scenario might be a consequence of any of the three previous ones (i, ii or iii) or even a consequence of autonomous learning. Topics iii and iv, explicit pronunciation instruction and raising language learners’ phonological awareness, will be further explored in the review of literature, since they are directly connected with the objectives of this paper.

The present paper builds on two theoretical frameworks: Schmidt’s Noticing Hypothesis (1990) and Flege’s Speech Learning Model (1995). Schmidt (1990; 1995) postulates that learning cannot take place without the learner consciously noticing the target feature. Applied to the field of L2 speech learning, noticing the phonological form is necessary for its acquisition. Moreover, Schmidt (1990) as well as other researchers (e.g. Ellis, 2005; Long, 1991) argue that noticing can be enhanced with explicit instruction. The Speech Learning Model (SLM), on the other hand, provides specific predictions on how L2 learners can come to notice L2 sounds. Namely, the L2 learner is more likely to notice, and consequently form categories for, L2 sounds that are perceived as dissimilar to the L1 sounds. Conversely, the learner is more likely to have difficulties in noticing differences between L1 and L2 vowels that are perceived as similar (equivalence classification).

The objective of the paper is to examine the effectiveness of FI combined with explicit pronunciation instruction in developing EFL learners’ perception and production of L2 vowels and to present consciousness-raising pronunciation activities which could be employed by language learners autonomously in order to complement the benefits of explicit pronunciation instruction. The paper is structured as follows. We will begin by discussing some previous studies on explicit pronunciation instruction and the employment of consciousness-raising activities in the L2. We will then present the results of a study examining the effect of explicit pronunciation instruction on L2 vowel production and perception by looking into the objectives, participants, method, procedure and the results. We will then turn to presenting two pronunciation activities developed to raise EFL learners’ awareness of L2 pronunciation. Finally, we will discuss the role of explicit pronunciation instruction and consciousness-raising in L2 pronunciation teaching and put forward some suggestions for further research.

1. Literature review

In this section we will review some recent studies on the effectiveness of explicit pronunciation instruction and the role of phonological awareness in the FL classroom.

1.1 Explicit pronunciation instruction

Gordon & Darcy (2016) suggest that explicit L2 pronunciation teaching would be necessary in order to enhance EFL learners’ oral abilities. In fact, despite the discouraging results of some early studies on the effect of pronunciation instruction on EFL learners’ production of L2 sounds (Purcell & Suter, 1980), recent research evaluating the effect of pronunciation instruction in improving the perception and/or production of L2 target sounds has reported positive results (Gordon & Darcy, 2016; Kissling, 2013; Saito, 2012; Thomson & Derwing, 2014). For instance, Saito (2012) investigated the effects of a short period of L2 explicit pronunciation teaching (four hours) on the accentedness and comprehensibility of 20 native Japanese learners of English. Instruction targeted English L2 sounds that are commonly mispronounced by Japanese learners. Learners’ oral abilities were assessed immediately before and after the pronunciation instruction and they were judged by four native English speakers (NES). No significant reduction in foreign accentedness was observed as a result of the explicit instruction received. However, the learners’ comprehensibility, as perceived by the NES, had significantly improved.

In another study that investigated the effects of L2 pronunciation instruction, Gordon, Darcy & Ewert (2013) compared three groups of ESL learners differing in the type of pronunciation instruction received. One group received instruction on segmentals, another group received instruction on suprasegmentals and the third group was taught a combination of both. Learners’ comprehensibility was assessed immediately before and after a 3-week instruction period by a group of 12 non-native English speakers (NNES). Results revealed that only the learners whose instruction focused on suprasegmentals were perceived as significantly more comprehensible at post-test.

In a later study, Thomson and Derwing (2014) reviewed 75 different pronunciation studies and concluded that “pronunciation instruction is effective in improving the target form(s)” (p.7), since 82% of the studies reported a significant improvement as a result of the instruction received. Moreover, the authors add that “pronunciation research and instruction should be primarily concerned with helping learners become more understandable” (p.2).

In spite of the reported positive effects of pronunciation instruction in improving learners’ intelligibility and comprehensibility of L2 speech, pronunciation instruction has not been given the necessary importance in the FL classroom (Piske, 2008). According to Gilbert (2010), pronunciation is still the EFL orphan, since it is either completely neglected in the FL classroom or it is the language aspect given the least attention (Fraser, 2000). Setter and Jenkins (2005) add that neither pronunciation instruction nor training of perceptual and/or production abilities has a secure place in most language curriculums. This is because pronunciation instruction is often viewed as a complementary activity rather than an essential part of the EFL syllabus (Cenoz & García-Lecumberri, 1999).

Given this lack of attention to pronunciation in the FL classroom, alternative avenues may be found in High Variability Phonetic Training (HVPT) and/or autonomous learning. On the one hand, HVPT, which provides learners with immediate feedback, might be a good complement to explicit pronunciation instruction in order to explicitly draw the learners’ attention to challenging L2 segments. Both perceptive and productive abilities might be enhanced by HVPT even after a short training regime (e.g., Author, 2017a; Authors, 2014a; Alves & Luchini, 2016; Bradlow, 2008; Logan & Pruitt, 1995; Rato & Rauber, 2015). On the other hand, the creation of autonomous pronunciation activities for learners might also be effective in facilitating L2 phonological learning. According to Holec (1981), autonomous activities promote the individualization of learning by allowing learners to control their own performance and gradual progress. Moreover, previous research indicates that the employment of consciousness-raising activities in the FL classroom is beneficial for students’ language proficiency (e.g., White & Ranta, 2002), an issue which will be further discussed in the following section.

1.2 Raising language learners’ phonological awareness

Even though a large number of studies have shown the effectiveness of consciousness-raising activities in FL learning, in the field of phonology the employment of activities to raise awareness of L2 phonology has not been extensively studied. Phonological awareness is seen to form part of Language Awareness, which can be defined as “explicit knowledge about language, and conscious perception and sensitivity in language learning, language teaching and language use” (Association for Language Awareness, 2012). Extending this definition to the field of phonology, phonological awareness in the L2 context can be seen to consist of declarative and procedural knowledge.

Declarative knowledge about the L2 phonology would be explicit knowledge that the L2 learner can verbalize (metaphonetic awareness for Wrembel, 2011). Declarative knowledge about L2 phonology can be manifested in metaphonetic tasks such as manipulating L2 phones (Venkatagiri & Levis, 2007), visually and auditorily analyzing and comparing L2 and L1 pitch patterns (Ramírez Verdugo, 2006), elaborating metalinguistic journal entries about L2 pronunciation (Kennedy & Blanchet, 2014; Kennedy & Trofimovich, 2010) or commenting on one’s own pronunciation (Wrembel, 2011, 2013).

Procedural knowledge about the L2 phonology, on the other hand, would be intuitive knowledge that cannot be verbalized (phonetic/phonological sensitivity for Piske, 2008). It can be accessed through mimicry tasks (Authors, 2014b; Flege & Hammond, 1982), non-word reading (Venkatagiri & Levis, 2007), analysis of self-repairs (Wrembel, 2011, 2013) or perception tasks employing a time pressure, such as lexical decision or priming tasks (Author, 2017b).

Viewing phonological awareness in the L2 as consisting of both verbalizable and non-verbalizable knowledge explains why L2 speech learning might be a complex task. Whereas morphosyntactic and lexical acquisition are more susceptible to conscious learning efforts, the inherent nature of speech makes conscious noticing of phonological features difficult for L2 learners (Jilka, 2009). Furthermore, due to trade-offs between form and meaning, only the most proficient language learners whose attentional resources are no longer needed in deciphering the meaning of the message can focus on its form (VanPatten, 1996). Following Schmidt (1995, and elsewhere), if a (phonological) feature is not noticed, it will not be acquired. For these reasons, aiding learners to notice features of L2 pronunciation is essential.

Previous studies indicate that language learners who are more aware of their L2 pronunciation and who possess high L2 phonological awareness demonstrate more accurate L2 speech perception and production (Author, 2015; Baker & Trofimovich, 2006). Phonological awareness about the L2 can be developed through any activity that brings a specific aspect into the language learners’ consciousness (consciousness-raising; CR). Examples of CR activities in L2 speech learning are: explicit teaching of L2 phonology (e.g. Gordon & Darcy, 2016; Kissling, 2013; Saito, 2012), explicit comparing and contrasting of L1 and L2 phonology (Solé & Estebas, 2000), articulatory and/or perceptual training (e.g. Alves & Magro, 2011; Authors, 2014a), use of enhanced input and the use of feedback (e.g. Saito, 2013), to name but a few.

The vast majority of studies focusing on increasing language learners’ awareness of L2 phonology have taken place in an instructional, teacher- or researcher-led setting. Some studies (e.g. Kennedy et al., 2014; Wrembel, 2011) have nevertheless employed CR activities that, with little or no instruction, could be applied by the learners autonomously outside the classroom. Kennedy et al. (2014), Kennedy and Blanchet (2014) and Kennedy and Trofimovich (2010) taught learners to keep learning journals in order to write down reflections on pronunciation learning-related issues raised while they were attending a TL speaking and listening course. Wrembel (2011) recorded language learners and asked them to elaborate on some metaphonetic questions while the recordings were played back to them (stimulated recall). The results from these studies are positive for L2 speech development: learners seemed to become more aware of the target language phonology, which eventually might lead to more target-like perception and production.

As mentioned in the previous section, increasing learner autonomy is highly beneficial as becoming an autonomous learner means that learning is not limited to the classroom context. Teaching learners how to increase awareness about L2 phonology does not only positively reflect on their L2 pronunciation, but also enables them to take control of their pronunciation learning by developing self-monitoring abilities. This seems especially relevant for L2 speech acquisition when we take into account the already discussed issues of the quality and quantity of L2 input available for learners in the FL classroom (Derwing & Munro, 2015; Larson-Hall, 2008) and the inherent difficulty of noticing L2 phonology (Jilka, 2009; VanPatten, 1996).

Consequently, we start from the premise that the best results for L2 speech development may be obtained by combining explicit pronunciation teaching inside the classroom with helping the learners to continue raising their phonological awareness outside the classroom. Section 2 discusses an empirical study which aimed at improving learners’ perception and production of L2 vowels through explicit instruction. Section 3 presents two autonomous CR activities which were designed to help language learners to increase their phonological awareness outside the classroom.

2. Improving pronunciation inside the classroom

This classroom-based study aimed at further investigating the effects of an instructional treatment on the perception and production of L2 sounds. More specifically, this study sought to determine whether an 8-week FI period combined with explicit pronunciation instruction could positively affect the perception and/or the production of five Standard Southern British English (SSBE) vowels /iː, ɪ, æ, ʌ, ɜː/ by Spanish/Catalan EFL learners.

The specific L2 vowel sounds were selected on the basis of previous research reporting that these vowels are commonly mispronounced by native speakers of Spanish/Catalan (Cebrian, 2006; 2015; Rallo-Fabra & Romero, 2012). The acquisition of the vowel pair /i/-/ɪ/ is problematic since neither Spanish nor Catalan has a comparable tense-lax distinction. Catalan learners of English have been found to assimilate English /i/ to their L1 /i/ category, while the lax vowel /ɪ/ has been found to be less consistently categorized, being mapped onto Catalan /e/ and, to a lesser extent, onto /i/ (Cebrian, 2006). The vowel pair /æ/-/ʌ/ is troublesome for these learners as there is no low front-back distinction in Spanish or Catalan, both languages having only one low vowel, /a/. As a consequence, Catalan/Spanish bilinguals have been found not only to perceive, but also to produce both English /æ/ and /ʌ/ in terms of the existing L1 vowel category (Aliaga-García & Mora, 2009; Fullana & MacKay, 2003). Concerning the central vowel /ɜː/, Cebrian et al. (2011) report no clear match for this sound in Spanish/Catalan: it is assimilated, with low degrees of perceptual assimilation, to the Catalan sounds /ɛ, e, o, ɔ/. Moreover, Cebrian (2015) found the SSBE central vowel /ɜː/ to be very dissimilar from Spanish and Catalan vowels and stated that “such vowel is a good candidate for accurate L2 categorization given enough exposure to the target language” (p. 4), in accordance with the SLM (Flege, 1995).

In light of the above, the study sought to investigate the following research questions and hypotheses.

RQ1: Does an 8-week FI period combined with explicit pronunciation instruction modify the perception of L2 vowel sounds by Spanish/Catalan learners of English?

H1: In line with previous research, it is hypothesized that an 8-week FI period combined with explicit pronunciation instruction will play a role in modifying L2 learners’ vowel perception (Gordon et al., 2013; Saito, 2015), due to the combination of TL exposure and the metalinguistic knowledge received.

RQ2: Does an 8-week FI period combined with explicit pronunciation instruction modify the production of L2 vowel sounds by Spanish/Catalan learners of English as perceived by native English speakers?

H2: In line with previous research, it is hypothesized that an 8-week FI period combined with explicit pronunciation instruction will not suffice to modify L2 learners’ production abilities (Fullana & Mora, 2007). Longer periods of FI would need to be undertaken in order to observe significant pronunciation gains (e.g. Saito, 2015).

2.1 Methods³

2.1.1 Participants

A group of sixteen Spanish/Catalan bilinguals learning English as an L2 formed the cohort of the present study. All learners were second-year English majors at a state-funded university in Barcelona and were enrolled in the third semester of an English Studies degree. Students were attending five compulsory courses, which were taught entirely in English⁴ by either native or proficient non-native instructors⁵. Importantly, all students were taking an introductory phonetics and phonology course at the time, which consisted of theoretical and practical approaches to English segmental phonetics as well as a contrastive analysis between the participants’ L1 and the TL, English. Further details on the introductory phonetics and phonology course and the English exposure learners were receiving are provided in section 2.1.4.

Three SSBE NES, who were living in the UK at the time, took part in the study by providing baseline data and validating the materials. An additional group of four SSBE NES judged learners’ oral productions for comprehensibility. For practical reasons, the NES judges were living in Barcelona. They were all TEFL teachers who had been in Barcelona for an average of 6 years at the time of the data collection. Tests were performed in a quiet room using good-quality headphones (SONY ZX310) connected to a laptop computer. Learners’ language proficiency was assessed through the Cambridge Online test (COT), which is an open source proficiency test.⁶ Table 1 presents demographic information about the participants.

Table 1 Demographic characteristics of the participants. 

Variable (n = 16)                      Mean    SD     Range
Age (years)                            19.6    1.4    18-23
AOL (years)                            5.8     2.2    3-12
Years of English formal instruction    12.3    3.1    5-18
Time abroad (months)                   0.5     1.1    0-3.5
COT grade (scale: 0-10)                7.6     2.2    3.7-10

As observed in Table 1, the mean score of the COT was 7.6 out of 10, which corresponds to an upper intermediate level of proficiency. Nevertheless, the results ranged from 3.7 to 10, meaning that the level of proficiency among the participants was not homogeneous. Of the 16 participants, ten were female and six were male. Their mean age was 19.6, ranging from 18 to 23. Learners’ age at first exposure to English ranged from three to 12 years, with a mean of 5.8. Moreover, the average number of years of English FI was 12.3, ranging from a minimum of five to a maximum of 18. Additionally, it is relevant to note that learners differed in their self-reported language dominance. That is, six learners claimed to be dominant in Catalan and ten learners reported being dominant in Spanish.⁷

2.1.2 Procedure and testing tasks

The L2 learners were tested before (T1) and after (T2) an eight-week period of FI in English, which included pronunciation instruction (see section 2.1.4). Testing assessed learners’ ability to perceive and produce English vowels and aimed at detecting perceptual and/or production changes emerging as a result of the instructional treatment.

A seven-alternative forced-choice (7FC) identification task assessed learners’ perception. The perceptual testing was delivered by means of the software TP 3.1 (Rauber, Rato, Kluge & Santos, 2012) and the task contained the five target vowel sounds /iː, ɪ, æ, ʌ, ɜː/ and two fillers /e, ɑː/ as response options. Participants were given the following instructions: “You will be hearing a series of non-words and some real words. Your task is to click on the sound you think that matches the word/non-word spoken”. Learners’ production abilities were assessed through a picture naming task, delivered through a PowerPoint presentation and monitored by the first author and a sound technician. The L2 production data were analysed by means of NES comprehensibility judgments. Comprehensibility in this study was understood as ‘the degree of difficulty the listener reports in attempting to understand the utterance’ (Munro & Derwing, 2001, p. 454). Production was rated on a 9-point Likert scale, in which 1 meant “easy to recognize as the target sound” and 9 meant “difficult to recognize as the target sound”. The NES judges first identified the vowel sounds that they heard in a 7FC identification task⁸, and subsequently rated the learners’ production by indicating how difficult it was to recognize the produced sound as the target sound. Each of the five target vowels was recorded in two different words, each produced twice (2 × 5 × 2), totalling 20 words per participant. Thus, with 16 participants, 320 tokens were analysed. Two production measures were adopted: the percentage of correct identification of target sounds by the NES and the median of the comprehensibility rating scores. All stimuli were evaluated in one single session. Judges could hear each individual production twice and could not go back to previous stimuli and/or change their previous responses.
A reliability analysis using an intra-class correlation coefficient (ICC) with a level of “absolute agreement” was conducted on the rating scores in order to assess whether the raters agreed with one another. The results of the reliability analysis revealed robust inter-rater agreement (α=.883), indicating that the four judges rated the learners’ productions consistently.
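The reliability analysis described above can be illustrated with a minimal sketch of a two-way random-effects ICC for absolute agreement. The text does not state whether the single-measure or the average-measure coefficient was reported, so the function below returns both; the function name and the small rating matrix are illustrative assumptions and are not the study's data.

```python
from statistics import mean


def icc_absolute_agreement(ratings):
    """Two-way random-effects ICC for absolute agreement.

    `ratings` holds one row per rated token and one column per judge.
    Returns (single_measure_icc, average_measure_icc).
    """
    n = len(ratings)        # number of rated tokens
    k = len(ratings[0])     # number of judges
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares for tokens (rows), judges (columns) and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


# Illustrative matrix only (6 tokens x 4 judges); NOT data from this study.
demo_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

single_icc, average_icc = icc_absolute_agreement(demo_ratings)
print(f"single-measure ICC: {single_icc:.2f}")
print(f"average-measure ICC: {average_icc:.2f}")
```

Because absolute agreement penalizes systematic differences between judges (the judge mean square enters the denominator), a judge who is consistently harsher than the others lowers the coefficient even when all judges rank the tokens identically.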

Both tests were carried out at the speech laboratory of the university where the learners were studying. The perception test took place in a quiet computer room with individual computers and headphones, and the production test took place in a soundproof room at the same institution. At both testing times (T1 and T2), the production testing occurred before the perceptual testing in an attempt to minimize any carry-over effect from the perception task to the production task. The overall duration of the testing session (perception + production) ranged from 30 to 40 minutes with two time-controlled breaks of 30 seconds, and learners were given course credit for their participation. Prior to the assessment, three SSBE NES performed the perceptual ID task and obtained over 96% accurate performance, thus validating the stimuli. The instructional treatment took place at the same institution and will be described further in section 2.1.4.

2.1.3 Stimuli

Stimuli for the perception task consisted of CVC real and non-words containing the five target vowel sounds /iː, ɪ, æ, ʌ, ɜː/ produced by two SSBE speakers (a male and a female). They were both from the south of England and had spent most of their lives there, and thus they spoke a homogeneous and standard variety of British English, fulfilling the requirement for selection. Neither talker reported speaking any other languages fluently or having any knowledge of or contact with Spanish/Catalan in their daily routines. Talkers were recorded within a period of a week and were paid for their participation. All participants reported having normal vision and hearing. Recordings were carried out at the Speech, Hearing and Phonetic Science Department at University College London (UCL). Stimuli were elicited by means of a sentence reading task which made use of the following carrier sentence: “I say X, I say X now, I say X again”. The stimuli comprised 30 CVC non-words, seven practice non-words and eight non-word fillers involving the vowels /e/ and /ɑː/, totalling 150 trials for non-words. Moreover, testing involved ten CVC real words and eight real words as testing fillers, totalling 56 trials for real words. The production task, on the other hand, elicited real word stimuli only. The picture naming task contained different pictures that were used to elicit participants’ production of the five target sounds. Each word was elicited twice from each participant. The stimuli lists for perception and production can be seen in Appendix 1.

2.1.4 Instructional treatment

As previously mentioned, learners were attending five courses, which corresponded to the third semester of their English Studies degree. The specific course list and the number of hours corresponding to each subject per semester are described in Table 2. Since an entire semester at this institution corresponds to approximately 16 weeks of FI, the present investigation assessed the effect of half of the total number of hours for each subject. In this paper, formal instruction is viewed as the exposure to the TL received during the classes, not necessarily including focus on form (FOF). As observed in Table 2, the total number of hours of FI is divided into two categories: teacher-led and supervised. Teacher-led hours are understood as lecturing time, whereas supervised hours are understood as hours that involve peer interaction with teacher supervision. On this basis, students would be exposed to 240 hours of input through the teacher-led hours and 117 hours of peer interaction throughout the entire semester. Consequently, 120 hours of good-quality input and 58.5 hours of classroom interaction are being evaluated in the present investigation⁹.

Table 2 Compulsory subjects during the third semester of the English Studies degree 

Compulsory subjects                  Hours of FI per semester
                                     Teacher-led    Supervised    Total
History and Culture of the USA       50             25            75
Use of the English Language I        45             22            67
English Phonetics and Phonology I    45             20            65
English Grammar                      50             25            75
Victorian Literature                 50             25            75
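The exposure figures quoted in the text (240 teacher-led and 117 supervised hours per semester, halved to 120 and 58.5 for the eight-week window) can be checked with a short sketch. It assumes, as the text implies but does not state outright, that hours are spread evenly over the roughly 16 weeks of the semester; the function and variable names are illustrative.

```python
# Hours of FI per semester copied from Table 2: (teacher-led, supervised).
HOURS_PER_SEMESTER = {
    "History and Culture of the USA": (50, 25),
    "Use of the English Language I": (45, 22),
    "English Phonetics and Phonology I": (45, 20),
    "English Grammar": (50, 25),
    "Victorian Literature": (50, 25),
}

SEMESTER_WEEKS = 16  # approximate semester length given in the text
STUDY_WEEKS = 8      # interval between T1 and T2


def exposure_hours(table, study_weeks, semester_weeks):
    """Sum the hours per category and scale them to the study window.

    Assumption: hours are distributed evenly across the semester.
    """
    teacher_led = sum(led for led, _ in table.values())
    supervised = sum(sup for _, sup in table.values())
    fraction = study_weeks / semester_weeks
    return {
        "teacher_led_semester": teacher_led,
        "supervised_semester": supervised,
        "teacher_led_study": teacher_led * fraction,
        "supervised_study": supervised * fraction,
    }


totals = exposure_hours(HOURS_PER_SEMESTER, STUDY_WEEKS, SEMESTER_WEEKS)
for label, hours in totals.items():
    print(f"{label}: {hours}")
```

Under the same even-distribution assumption, the phonetics and phonology course alone would account for about 22.5 teacher-led and 10 supervised hours of the eight-week window, which gives a sense of how small a share of the total exposure the explicit pronunciation instruction represents.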

The explicit pronunciation instruction examined in this study was administered through a compulsory subject titled ‘Phonetics and Phonology I’. In this specific subject, the instruction consisted of providing students with an introduction to phonetics and phonology, an articulatory description of the English sounds and a contrastive analysis between their L1s (Spanish/Catalan) and English. Moreover, some practical approaches, such as phonemic and phonetic transcription, reading practice and computer exercises in the speech laboratory were part of the course.

2.2 Results and Discussion

Participants’ perception and production of the target vowel sounds were assessed immediately before and after an instructional treatment, including FI and explicit pronunciation instruction. Perception results were calculated as the percentage of correct responses in the forced-choice identification task. Production results were assessed by means of NES identification and judgments of learners’ productions. Four different judges first identified learners’ productions of L2 vowels embedded in real words and then provided a comprehensibility rating on a 9-point Likert scale. The perception results will be presented first, followed by the production results.
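The scoring just described can be sketched as follows: percentage of correct responses for the identification data, and, for production, the percentage of tokens identified as the target vowel plus the median comprehensibility rating. The function names and the sample responses are hypothetical and serve only to show the calculations; the design constants come from the Methods section.

```python
from statistics import median

# Design constants taken from the Methods description.
N_PARTICIPANTS = 16
N_VOWELS = 5
WORDS_PER_VOWEL = 2
REPETITIONS = 2
TOKENS_PER_PARTICIPANT = N_VOWELS * WORDS_PER_VOWEL * REPETITIONS
TOTAL_TOKENS = N_PARTICIPANTS * TOKENS_PER_PARTICIPANT


def percent_correct(trials):
    """Percentage of (target, response) pairs where the response matches the target."""
    if not trials:
        raise ValueError("no trials to score")
    hits = sum(1 for target, response in trials if target == response)
    return 100.0 * hits / len(trials)


def production_measures(judgments):
    """Summarise judge data given as (target, identified_vowel, rating) triples.

    Returns the percentage of correct identifications and the median
    comprehensibility rating (1 = easy to recognize, 9 = difficult).
    """
    identification = [(target, heard) for target, heard, _ in judgments]
    ratings = [rating for _, _, rating in judgments]
    return percent_correct(identification), median(ratings)


# Hypothetical responses, invented purely for illustration.
perception_trials = [("iː", "iː"), ("ɪ", "iː"), ("æ", "æ"), ("ʌ", "æ"), ("ɜː", "ɜː")]
judge_data = [("iː", "iː", 2), ("ɪ", "e", 7), ("æ", "æ", 3), ("ʌ", "ʌ", 4)]

perception_score = percent_correct(perception_trials)
identified, median_rating = production_measures(judge_data)
print(TOKENS_PER_PARTICIPANT, TOTAL_TOKENS)
print(perception_score, identified, median_rating)
```

The median is used for the ratings because a 9-point Likert scale is ordinal, so a rank-based summary is a safer choice than the mean.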

2.2.1 Performance on L2 vowel perception

Perception scores on non-word identification and real word identification were calculated at both testing times and are displayed in Table 3 and Figure 1.

Table 3 Mean percentage identification of non-words and real words at the two testing times. 

Time    Non-words        Real words
        %       SD       %       SD
T1      54.9    10.7     73.2    11.2
T2      59.3    12.1     81.5    10.5
Gain    4.4              8.3

Figure 1 Mean percentage identification of non-words and real words at two testing times. 

In order to assess whether the instructional treatment had a significant impact on the perception of real and/or non-words, a two-way RM ANOVA was carried out on the percentage scores. Time and Stimulus Type (non-words vs. real words) were explored as within-subject factors. Results revealed a significant effect of Time (F(1, 15) = 49.230, p < .001, partial η2 = .766), a significant effect of Stimulus Type (F(1, 15) = 95.234, p < .001, partial η2 = .864) and a non-significant Time*Stimulus Type interaction (F(1, 15) = 2.008, p = .177, partial η2 = .118). The effect of Time may be accounted for by the higher scores obtained at T2 than at T1, and the effect of Stimulus Type may be accounted for by the higher scores obtained with real words than with non-words. The lack of interaction is likely due to the fact that both stimulus types (real vs. non-words) improved similarly from T1 to T2.
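
The effect sizes reported for the ANOVA can be cross-checked from the F ratios alone, since for a single effect partial η2 equals (F × df_effect) / (F × df_effect + df_error). The Python sketch below applies this standard identity to the three reported tests. The function name is ours, and the check is only a consistency test on the published values, not a re-analysis of the data.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), as given in the text.
REPORTED = {
    "Time": (49.230, 1, 15, 0.766),
    "Stimulus Type": (95.234, 1, 15, 0.864),
    "Time*Stimulus Type": (2.008, 1, 15, 0.118),
}

for effect, (f_value, df1, df2, reported) in REPORTED.items():
    recomputed = partial_eta_squared(f_value, df1, df2)
    print(f"{effect}: recomputed {recomputed:.3f}, reported {reported:.3f}")
```

All three recomputed values agree with the reported ones to three decimals, which suggests the F ratios and effect sizes are internally consistent.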

Taken together, these results show that an 8-week period of FI combined with explicit pronunciation instruction was successful in improving learners’ perception of the five target sounds embedded in both non-words and real words. However, the target sounds were better identified when embedded in real words than in non-words from the outset, and the gain was numerically larger for real words (8.3 percentage points) than for non-words (4.4 percentage points), although the non-significant interaction indicates that this difference in gains was not statistically reliable.

The fact that learners identified the target vowel sounds better when embedded in real words than when embedded in non-words is a noteworthy outcome. This may indicate that learners found it easier to recognize the vowels when they were in words that they recognized and possibly heard during FI. This finding supports previous research indicating that lexical knowledge is essential for vowel category formation (Mora, 2005). It also relates to previous findings from lexical decision studies, which report that learners perform faster and more accurately when identifying real words than non-word stimuli (Vitevitch & Luce, 1999).

To investigate the effect of FI combined with explicit pronunciation instruction further, the identification scores for each of the target vowels were examined at both testing times. T-tests were run on the percentage scores and the results can be seen in Table 4 and Figure 2.

Table 4 Identification performance for individual sounds embedded in real and non-words. 

Stimulus Type   Vowel   T1 (%)   T2 (%)   T-test result
Non-words       æ       39.8     40.9     Non-significant
                ɜː      39.1     41.4     Non-significant
                ɪ       69.1     80.2     p < .05
                iː      67.4     64.6     Non-significant
                ʌ       55.2     61.9     p < .05
Real words      æ       50.8     57.1     Non-significant
                ɜː      67.9     85.9     p < .01
                ɪ       89.0     93.7     Non-significant
                iː      82.1     79.7     Non-significant
                ʌ       71.1     81.2     Non-significant

As evidenced in Table 4, the vowels that were perceived the most poorly from the outset were /ɜː/, /æ/ and /ʌ/. In turn, the vowels that were perceived the most accurately at the outset were the high front vowels, especially the lax one. This initial perception is striking since the SLM would predict that the lax high front vowel, being a dissimilar sound to the L1 inventory, would pose as much difficulty to Spanish/Catalan learners as the vowels /æ/ and /ʌ/. Importantly, learners were able to improve their perception of two vowel sounds embedded in non-words significantly, namely /ɪ/ and /ʌ/. Interestingly, these results pattern in line with the predictions of the SLM, as these two sounds correspond to “dissimilar sounds” from the learners’ L1 inventory. The SLM predicts that dissimilar sounds are strong candidates for accurate perception, provided that enough good quality input is received. However, the most dissimilar sound when comparing the L1-L2 vowel inventories (/ɜː/) was not perceived more accurately after the FI when embedded in non-words. This result is particularly interesting considering the low percentage obtained at both testing times and the fact that learners succeeded in improving their perception of the low-mid central vowel when embedded in real words (Table 4 and Figure 2).
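
The per-vowel changes in Table 4 can be summarized as gain scores (T2 minus T1). The following Python sketch is only a reading aid: the percentages are copied from Table 4, the helper name is ours, and the "iː" key is our reading of the fourth row in each block. Significance still has to be taken from the t-test column, since a gain score alone says nothing about variability.

```python
# Mean identification scores (%) at T1 and T2, copied from Table 4.
# The "iː" key is our reading of the fourth row in each block, based on the
# text's reference to the tense high front vowel /iː/.
TABLE_4 = {
    "non-words": {"æ": (39.8, 40.9), "ɜː": (39.1, 41.4), "ɪ": (69.1, 80.2),
                  "iː": (67.4, 64.6), "ʌ": (55.2, 61.9)},
    "real words": {"æ": (50.8, 57.1), "ɜː": (67.9, 85.9), "ɪ": (89.0, 93.7),
                   "iː": (82.1, 79.7), "ʌ": (71.1, 81.2)},
}


def gains(scores):
    """Return {vowel: T2 - T1} in percentage points, rounded to one decimal."""
    return {vowel: round(t2 - t1, 1) for vowel, (t1, t2) in scores.items()}


for stimulus_type, scores in TABLE_4.items():
    # Sort vowels from largest to smallest gain for easier reading.
    ranked = sorted(gains(scores).items(), key=lambda item: item[1], reverse=True)
    print(stimulus_type, ranked)
```

Read this way, the largest gains are /ɪ/ in non-words (11.1 points) and /ɜː/ in real words (18.0 points), while the tense high front vowel shows small numerical decreases in both conditions.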

2.2.2 Performance on L2 vowel production

As previously explained, two measures were applied for L2 vowel production analysis: SSBE NES identification of learners’ productions of L2 vowels embedded in real words and NES comprehensibility rating scores. The percentage of correct identification by the NES and the median rating scores for each token at T1 and T2 were calculated (Table 5).

Table 5 Production performance with target vowel sounds. Percentage of correct identification and median ratings. 

Time    Correct identification by NES    Ratings by NES
        %       SD                       Median    SD
T1      80.0    10.3                     4.5       1.4
T2      83.7    13.1                     4.2       0.8
Gain    3.7                              -0.3

Table 6 Comprehensibility rating scores for each vowel (Scale: 1-9) 

Vowel   T1               T2
        Rating   SD      Rating   SD
æ       4.7      2.8     4.1      2.4
ɜː      6.1      1.9     5.9      2.1
ɪ       4.8      2.6     4.4      2.6
iː      5.1      1.9     4.9      2.1
ʌ       1.9      2.8     1.8      2.5

As observed in Table 5, 80% of the L2 sounds were correctly identified at the outset of the study. However, these tokens received low comprehensibility ratings (median of 4.5 on a 1-9 scale). The percentage of correctly identified vowels increased numerically after the FI (by 3.7 percentage points), albeit non-significantly, as evidenced by a paired t-test (t = -1.14, df = 15, p = .135, one-tailed). In turn, the rating scores showed a slight, non-significant decrease (0.3) from T1 to T2. These results indicate that the instructional treatment received in this study did not succeed in altering the comprehensibility of the learners’ production as perceived by NES.
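
For readers who wish to verify the reported one-tailed probability, the sketch below recovers it from the t statistic and its degrees of freedom using only the Python standard library. It integrates the Student's t density numerically with Simpson's rule; the function names and the number of integration steps are our own choices, and a statistics package would normally be used instead. The result should come out close to the reported p = .135, with small differences expected because the t value is rounded.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t_value, df, steps=2000):
    """Tail probability P(T >= |t|), integrating the density on [0, |t|]
    with Simpson's rule (`steps` must be even)."""
    upper = abs(t_value)
    if upper == 0:
        return 0.5
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return 0.5 - area


p = one_tailed_p(-1.14, 15)  # t and df as reported for the paired t-test
print(f"one-tailed p = {p:.3f}")
```

Because the distribution is symmetric, the sign of t does not affect the tail area; it only indicates the direction of the T1 to T2 difference.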

The rating scores attributed to each individual vowel sound by the NES judges are shown in Table 6. Interestingly, the vowel rated highest for comprehensibility was the central vowel /ɜː/, followed by the high front vowel /iː/. Recall that the central vowel was the only vowel whose perception in real words improved significantly from T1 to T2, and that the tense high front vowel is the most acoustically similar vowel sound when comparing the L1 and L2 (Cebrian, 2006). In fact, the author describes the L1 high front vowel as nearly identical to its L2 counterpart. In turn, the most poorly rated vowel was /ʌ/, which seems to pose difficulties to learners both in perception and production. This result is in line with the predictions of the SLM, since this sound is non-existent in the learners’ L1.

The main objective of this study was to investigate the influence of formal instruction combined with explicit pronunciation instruction on the perception and production of five English vowels by Spanish/Catalan bilinguals majoring in English studies. Learners’ perception and production skills were assessed before and after an 8-week period of formal instruction, and the analysis revealed a perceptual improvement as a result of the formal instruction received, which contained explicit pronunciation instruction as well as additional transcription exercises. Participants mostly enhanced their perception of two vowel sounds from T1 to T2, namely /ɪ/ and /ʌ/. This improvement is probably accounted for by three related factors: the positive impact of explicit pronunciation instruction (Darcy et al., 2012; Kissling, 2013; Thomson & Derwing, 2014), the focus on form that such instruction entails (Long & Larsen-Freeman, 1991), and the consequent increase in the participants’ phonological awareness of the two novel English sounds. According to Author (2015), explicit pronunciation instruction might enhance the awareness of the target sounds by making them more noticeable, and as a consequence, more accurate perception and/or production of the specific target sounds might occur.

No improvement was observed in the production of the L2 sounds, as measured by NES judgements at T2, immediately after the instructional treatment. This result suggests that an 8-week period of FI, even when it includes explicit pronunciation instruction, did not suffice to alter learners’ production abilities immediately. Possibly, a larger amount of exposure to the TL would be required in order to modify learners’ production of vowel sounds. Thus, this study has provided further evidence that the quantity and/or quality of input received in the EFL context might not be enough to alter learners’ pronunciation (Muñoz, 2011). On the assumption that perception might lead production skills (Flege, 1995), the fact that learners were able to enhance their perception through FI might indicate that production changes could take place at a later phase. However, due to the lack of a delayed post-test in this study, this hypothesis cannot be confirmed.

Given the classroom-based nature of this study, explicit pronunciation instruction took place during the compulsory semester, which makes it impossible to separate the two types of instruction. The gains in L2 perception found in the current study may therefore be attributable to the L2 input received from general language classes, to the explicit pronunciation instruction received, or to both, and it is difficult to assess whether pronunciation instruction or classroom interaction led to the perceptual improvement. Ideally, the performance of a control group receiving no explicit pronunciation instruction would be assessed. However, due to the fact that all subjects are mandatory in the third semester at this institution, this scenario was not possible.

3. Improving pronunciation outside the classroom: raising learners’ phonological awareness

The results of the experimental study described in the preceding section suggest that being exposed to target language FI and explicit pronunciation instruction is not necessarily enough to result in substantial gains in the perception and production of L2 speech as far as short-term effects are concerned. Taking into account the low amount of input language learners are exposed to in a FL instructional setting, learners should be encouraged to seek learning opportunities outside the classroom. Although this is relevant for improving all areas of the L2, it is especially true of pronunciation, due to the difficulty of noticing phonetic details in the speech stream. Noticing is a prerequisite for learning (Schmidt, 1995), and a large body of research attests to the benefits of activities designed to promote noticing (CR activities). Moreover, learning outside the classroom develops learner autonomy. Autonomous learning increases motivation, makes learning more meaningful, and allows individualization and tracking of one’s learning (Holec, 1981).

With this in mind, the present section presents two tested and validated autonomous activities designed to raise language learners’ phonological awareness. The activities require very little or no guidance from the instructor and can be carried out by learners of any proficiency level and with no prior knowledge of phonetics and phonology. However, learners with a background in phonetics and phonology will also benefit from them. Both activities are designed to increase noticing by raising learners’ awareness of their own pronunciation, and can be applied at both the segmental and the suprasegmental level.

The first proposed activity is an L1-L2 pronunciation comparison task. In this task, learners first record themselves reading L2 sentences aloud and subsequently compare their recording to native speaker productions of the same sentences. The ability to perceive differences between one’s own speech and the L2 auditory targets might lead the speaker to modify their speech to correspond to the targets more closely, facilitating the development of accurate articulatory and auditory representations, as hypothesized by Baker and Trofimovich (2006). The comparison can be carried out orally (think aloud) or in writing, and the learner should go into as much detail as possible, listening to both samples several times and stopping when necessary. Lower-level learners could reduce the playback speed in order to facilitate the task (aural enhancement). When native speakers are not readily available, online learning materials (e.g. podcasts, videos) could be used instead.

Learners could be asked to focus on a specific feature (e.g. vowels) or to listen to the whole sample and elaborate on any features noticed. For example, in Author (forthcoming), Spanish/Catalan learners of English recorded L2 sentences with challenging phones. Participants were then divided into three groups that received different instructions for the L1-L2 pronunciation comparison. Learners in Group 1 were told to pay special attention to the vowels, learners in Group 2 were told to pay special attention to the consonants and learners in Group 3 were not given a specific focus and were asked to comment on any differences they could notice. The analysis of the groups’ performance is expected to shed light on which focus (narrow or broad) leads to more noticing.

Unless the task specifically calls for it, learners should not be required to use technical vocabulary; they should be allowed to explain differences in their own words, as declarative knowledge about phonology is not a requirement for L2 speech development and its use could be seen as intimidating. That being said, learners with a background in phonetics and phonology could benefit from employing the appropriate terminology or even trying to phonetically transcribe the productions. Furthermore, for students who are familiar with phonetics and phonology, using speech analysis software and accompanying the auditory comparisons with visual spectrogram and waveform analysis could be interesting. Ramírez Verdugo (2006) helped L1-Spanish learners of English to analyse TL pitch contours auditorily and visually during a 10-week training program. Her findings showed that learners’ awareness of English intonation had risen as a result of the training. Moreover, learners were able to transfer the newly gained awareness to production, and showed a more target-like prosodic performance.

Another task that can help language learners to increase their awareness of L2 phonology is to encourage them to reflect on pronunciation, for example through think-aloud protocols or questionnaires. Due to the very nature of speech, which unfolds nonstop, and to the supremacy of meaning over form (VanPatten, 1996), language learners rarely stop to think about pronunciation unless they are asked to do so, for instance in a pronunciation instruction class. Foreign language learners, and native speakers, for that matter, are largely unaware of the movements of their articulators and the constant flow of stimuli their cognitive and auditory systems are processing. Asking language learners to contemplate their views on pronunciation, to think about sounds that are difficult to pronounce or about all the different ways a sentence can be uttered (taking into account regional and social variation, for example) is a valuable way to raise learners’ awareness of L2 phonology. In order to aid the learners, the instructor can provide some questions or a specific topic (e.g. regional vowel variants) for discussion, or employ an already existing questionnaire.

Previous research indicates that engaging in self-reflection can be beneficial for developing phonological awareness. Kennedy and Blanchet (2014) asked participants taking a 15-week course in connected speech to reflect upon their learning in written journal entries. Comments about language awareness were positively related to the participants’ listening comprehension. Wrembel (2011), on the other hand, employed think-aloud protocols, and asked participants questions about pronunciation while they were listening to recordings of their own speech. Participants reported that the activity helped them to become more aware of their own pronunciation and to monitor their speech.

In another study (Author, 2015), Brazilian Portuguese learners of English answered a phonological self-awareness questionnaire at the end of a testing session which measured the learners’ phonological awareness. The learners were asked to provide their opinion on pronunciation-related statements and to evaluate their difficulty in perceiving L2 speech on a 5-point scale. The statements focused on self-perception (e.g. “I can hear I have a foreign accent when I speak in English”) and on the perception of L1-accented L2 speech (e.g. “There are some specific English sounds that are difficult for Brazilians”). The self-evaluations asked learners to estimate their abilities to notice and explain (cf. Schmidt, 1995: awareness at the level of understanding) speech phenomena. For example, learners were asked to think about how easy they find it to identify a regional accent, to determine whether a given intonation pattern is adequate in English or to explain why a given sound they hear is not pronounced in a target-like manner. Performance on this questionnaire was positively related to the participants’ L2 phonological awareness at the segmental level (r = .46, p < .001; see Author (2015) for further details), indicating that higher degrees of self-awareness were associated with greater overall knowledge about the L2 segmental domain. This questionnaire asked learners to choose from existing options, but the questions could easily be left open-ended, encouraging a more elaborate reflection. Furthermore, the studies reviewed in this section report on self-awareness of L2 pronunciation abilities as a whole. Instructors of a course on segmental phonetics and phonology such as the course reported in Section 2.1.4 might wish to limit the scope and bring learners’ attention to the vowel and consonant sounds of the L2.
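
To make this kind of relationship concrete for instructors who wish to try a similar questionnaire, the sketch below shows one way responses could be scored and correlated with a separate awareness measure. Everything in it is an assumption made for illustration: the scoring rule (mean of the 5-point responses), the function names and the placeholder numbers are ours and are not taken from Author (2015), which should be consulted for the actual procedure.

```python
import math


def questionnaire_score(responses):
    """Mean of one learner's responses on a 1-5 scale (assumed scoring rule)."""
    if not all(1 <= r <= 5 for r in responses):
        raise ValueError("responses must lie on the 1-5 scale")
    return sum(responses) / len(responses)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (n >= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder data for four hypothetical learners (NOT from the study).
answers = [[2, 3, 2, 3], [3, 3, 4, 3], [4, 4, 3, 5], [5, 4, 5, 5]]
awareness = [40.0, 55.0, 50.0, 70.0]  # hypothetical awareness-test scores

self_awareness = [questionnaire_score(a) for a in answers]
print(round(pearson_r(self_awareness, awareness), 2))
```

With open-ended questions, as suggested above, a numerical score of this kind would not be available and responses would need to be coded qualitatively instead.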

The aim of this section has been to provide some ideas on how language learners can foster their phonological awareness at home. Encouraging language learners to develop their phonological awareness outside the classroom through autonomous activities can be an effective way to aid the noticing of L2 speech features. Combined with FI and explicit phonetics instruction, such activities are likely to help learners reach optimal results in their L2 speech development.

4. General discussion

The present paper had two objectives. Firstly, it presented the results of an empirical study examining the effects of FI combined with explicit pronunciation instruction on the perception and production of five English vowels. The findings showed that only the learners’ perception skills improved immediately after the 8-week period of formal instruction and that the improved perception did not carry over to production. In other words, the learners’ pronunciation of the target vowels was not more comprehensible, as judged by NES, after the instructional period. Taking these results into account, the second objective of the paper was to present two autonomous activities that learners could employ outside the classroom in order to enhance their awareness of L2 vowels or L2 phonology in general.

The study reported in this paper is not without limitations, which suggest directions for further exploration. First, the study lacks a control group: the fact that all students were enrolled in the same mandatory courses made it impossible to have one. Another limitation is the lack of a delayed post-test, which could have revealed production results at a later stage. This question remains unanswered and calls for further research. The fact that the NES judges were living in Spain at the time and were TEFL teachers is also a limitation, since their familiarity with Spanish-accented English might have played a role in their ratings. Furthermore, production data were analysed by means of NES ratings only. An acoustic analysis of the stimuli would be necessary to fully evaluate the effect of FI on production, since learners may have modified their F1 and F2 values. In addition, contrasting the acoustic data with the NES ratings would allow an analysis of which aspects of production play a more relevant role in NES perception. Another issue that might have affected the outcomes of the present investigation is the fact that the participants were enrolled in a phonetics and phonology course and thus acquired meta-linguistic knowledge during the FI regime. It would be interesting to know the extent to which knowledge of phonetics and phonology may have affected the results, and to test students without this knowledge.

Previous research suggests that the quantity and quality of the L2 input might not be ideal in FL classroom settings (Derwing & Munro, 2015; Muñoz, 2008) and that noticing phonetic detail in the speech stream is challenging (Jilka, 2009). Consequently, we propose that optimal results in L2 speech learning would be best achieved by combining classroom-based learning with CR activities carried out autonomously outside the classroom. Whereas language learners can benefit from explicit instruction and the instructor’s expertise within the class time, promoting autonomous learning by helping learners to increase their noticing of phonetic detail on their own is likely to lead to superior results. Future research should compare the effects of classroom-only vs. classroom and autonomous learning on pronunciation development in order to determine which method leads to higher gains and which activities within and outside the classroom are the most beneficial.


We would like to thank the anonymous reviewers of this paper for their fruitful comments and suggestions.


Aliaga-García, C., & Mora, J. C. (2009). Assessing the effects of phonetic training on L2 sound perception and production. Recent research in second language phonetics/phonology: Perception and production. Cambridge Scholars Publishing, 2-31.

Alves, U. K., & Magro, V. (2011). Raising awareness of L2 phonology: Explicit instruction and the acquisition of aspirated /p/ by Brazilian Portuguese speakers. Letras de Hoje, 46, 71-80.

Alves, U. K., & Luchini, P. L. (2016). Percepción de la distinción entre oclusivas sordas y sonoras iniciales del inglés (LE) por estudiantes argentinos: Datos de identificación y discriminación. Lingüística, 32(1), 25-39. doi:10.5935/2079-312x.20160002

Baker, W., & Trofimovich, P. (2006). Perceptual paths to accurate production of L2 vowels: The role of individual differences. International Review of Applied Linguistics in Language Teaching, 44, 231-250.

Bradlow, A. R. (2008). Training non-native language sound patterns: Lessons from training Japanese adults on the English /r/ - /l/ contrast. Studies in Bilingualism Phonology and Second Language Acquisition, 287-308. doi:10.1075/sibil.36.14bra

Calvo-Benzies, Y. J. (2014). The teaching of pronunciation in Spain: students’ and teachers’ views. In T. Pattison (Ed.), IATEFL 2013 Liverpool Conference Selections (pp. 106-108). Faversham, UK: IATEFL.

Carlet, A. (2017). L2 perception and production of English consonants and vowels by Catalan speakers: The effects of attention and training task in a cross-training study. Unpublished doctoral dissertation. Universitat Autònoma de Barcelona, Barcelona, Spain.

Carlet, A., & Rato, A. (2015). Non-native perception of English voiceless stops. In E. Babatsouli & D. Ingram (Eds.), Proceedings of the International Symposium on Monolingual and Bilingual Speech 2015 (pp. 57-67). Chania, Greece: Institute of Monolingual and Bilingual Speech.

Cebrian, J. (2006). Experience and the use of non-native duration in L2 vowel categorization. Journal of Phonetics, 34(3), 372-387. doi:10.1016/j.wocn.2005.08.003

Cebrian, J., & Carlet, A. (2014). Second-language learners’ identification of target-language phonemes: A short-term phonetic training study. Canadian Modern Language Review, 70(4), 474-499. doi:10.3138/cmlr.2318

Cebrian, J., Mora, J. C., & Aliaga-García, C. (2011). Assessing crosslinguistic similarity by means of rated discrimination and perceptual assimilation tasks. In M. Wrembel, M. Kul, & K. Dziubalska-Kołaczyk (Eds.), Achievements and perspectives in the acquisition of second language speech: New Sounds 2010 (pp. 41-52). Frankfurt am Main: Peter Lang.

Cebrian, J. (2015). Reciprocal measures of perceived similarity. In the Scottish Consortium for ICPhS 2015 (Ed.), Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK: Glasgow University.

Celce-Murcia, M., Brinton, D., & Goodwin, J. (1996). Teaching pronunciation. Cambridge: Cambridge University Press.

Cenoz, J., & García-Lecumberri, M. L. (1999). The acquisition of English pronunciation: learners’ views. International Journal of Applied Linguistics, 9(1), 3-15. doi:10.1111/j.1473-4192.1999.tb00157.x

Couper, G. (2011). What makes pronunciation teaching work? Testing for the effect of two variables: socially constructed metalanguage and critical listening. Language Awareness, 20, 159-182.

Darcy, I., Ewert, D., & Lidster, R. (2012). Bringing pronunciation instruction back into the classroom: An ESL teachers’ pronunciation "toolbox". In J. Levis & K. Lavelle (Eds.), Proceedings of the 3rd Pronunciation in Second Language Learning and Teaching Conference, Sept. 2011 (pp. 93-108). Ames, IA: Iowa State University.

Derwing, T. M., & Munro, M. J. (2015). Pronunciation fundamentals: evidence-based perspectives for L2 teaching and research (Vol. 42). Netherlands: John Benjamins Publishing Company.

Ellis, N. C. (2005). At the interface: Dynamic interactions of explicit and implicit language knowledge. Studies in Second Language Acquisition, 27, 305-352.

Flege, J. E. (1991). Age of learning affects the authenticity of voice‐onset time (VOT) in stop consonants produced in a second language. The Journal of the Acoustical Society of America, 89(1), 395-411. doi:10.1121/1.400473

Flege, J. E. (1995). Second language speech learning: Theory, findings and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in Cross Language Research (pp. 233-277). Timonium, MD: York Press.

Flege, J., & Hammond, R. (1982). Mimicry of non-distinctive phonetic differences between language varieties. Studies in Second Language Acquisition, 5, 1-17.

Fraser, H. (2000). Coordinating improvements in pronunciation teaching for adult learners of English as a second language. Canberra: Department of Education, Training and Youth Affairs.

Fullana, N., & MacKay, I. R. (2003). Production of English sounds by EFL learners: The case of /i/ and /ɪ/. In Proceedings of the 15th international congress of phonetic sciences, ICPhS (pp. 1525-1528). Barcelona, Spain: ICPhS organizing committee.

Fullana, N., & Mora, J. C. (2009). Production and perception of voicing contrasts in English word-final obstruents: Assessing the effects of experience and starting age. In M. A. Watkins, A. S. Rauber, & B. O. Baptista (Eds.), Recent research in second language phonetics/phonology: Perception and production (pp. 97-117). Newcastle upon Tyne, UK: Cambridge Scholars Publishing.

García-Lecumberri, M. L., & Gallardo, F. (2003). English FL sounds in school learners of different ages. In M. P. García Mayo & M. L. García-Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 115-135). Clevedon: Multilingual Matters.

Gilbert, J. B. (2010). Why has pronunciation been an orphan? IATEFL Pronunciation Special Interest Group Newsletter, 43, 3-7.

Gordon, J., & Darcy, I. (2016). The development of comprehensible speech in L2 learners: A classroom study on the effects of short-term pronunciation instruction. Journal of Second Language Pronunciation, 2(1), 56-92. doi:10.1075/jslp.2.1.03gor

Gordon, J., Darcy, I., & Ewert, D. (2013). Pronunciation teaching and learning: Effects of explicit phonetic instruction in the L2 classroom. In Proceedings of the 4th Pronunciation in Second Language Learning and Teaching Conference (pp. 194-206).

Holec, H. (1981). Autonomy and foreign language learning. Oxford: Pergamon. (First published 1979, Strasbourg: Council of Europe)

Jilka, M. (2009). Talent and proficiency in language. In G. Dogil & S. Reiterer (Eds.), Language talent and brain activity (pp. 1-16). Berlin: Mouton De Gruyter.

Kennedy, S., & Blanchet, J. (2014). Language awareness and perception of connected speech in second language. Language Awareness, 23, 92-106.

Kennedy, S., Blanchet, J., & Trofimovich, P. (2014). Learner pronunciation, awareness, and instruction in French as a second language. Foreign Language Annals, 47, 79-96.

Kennedy, S., & Trofimovich, P. (2010). Language awareness and second language pronunciation: a classroom study. Language Awareness, 19, 171-185.

Kissling, E. M. (2013). Teaching pronunciation: Is explicit phonetics instruction beneficial for FL learners? The Modern Language Journal, 97(3), 720-744.

Kivistö-de Souza, H. (2015). Phonological awareness and pronunciation in a second language. Unpublished doctoral dissertation. University of Barcelona, Barcelona, Spain.

Kivistö-de Souza, H. (2017). The relationship between phonotactic awareness and pronunciation in adult second language learners. Revista Brasileira de Linguística Aplicada, 17, 185-214. doi:10.1590/1984-6398201610850

Larson-Hall, J. (2008). Weighing the benefits of studying a foreign language at a younger starting age in a minimal input situation. Second Language Research, 24(1), 35-63.

Logan, J., & Pruitt, J. (1995). Methodological issues in training listeners to perceive non-native phonemes. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in Cross Language Research (pp. 351-378). Timonium, MD: York Press.

Long, M. H. (1991). Focus on form: A design feature in language teaching methodology. In K. de Bot, R. Ginsberg, & C. Kramsch (Eds.), Foreign language research in cross-cultural perspective (pp. 39-52). Amsterdam: John Benjamins.

Long, M. H., & Larsen-Freeman, D. (1991). An introduction to second language acquisition research. Harlow: Longman.

Mora, J. C. (2005). Lexical knowledge effects on the discrimination of non-native phonemic contrasts in words and non-words by Spanish/Catalan bilingual learners of English. In ISCA Workshop on Plasticity in Speech Perception.

Mora, J. C., Rochdi, Y., & Kivistö-de Souza, H. (2014). Mimicking accented speech as L2 phonological awareness. Language Awareness, 23, 57-75. doi:10.1080/09658416.2013.863898

Moyer, A. (1999). Ultimate attainment in L2 phonology: The critical factors of age, motivation and instruction. Studies in Second Language Acquisition, 21, 81-108. [ Links ]

Monje-Sangüesa, V. (2016). The second time around: The effect of FI upon return from SA. (Unpublished MA Thesis). University Pompeu Fabra: Barcelona, Spain. [ Links ]

Munro, M., & Derwing, T. (2001). Modeling perceptions of the accentedness and comprehensibility of L2 speech; The role of speaking rate. Studies in Second Language Acquisition, 23, 451-468. [ Links ]

Muñoz, C. (2008). Symmetries and asymmetries of age effects in naturalistic and instructed L2 learning. Applied Linguistics, 29(4), 578-596. doi:10.1093/applin/amm056

Muñoz, C. (2011). Input and long-term effects of starting age in foreign language learning. IRAL - International Review of Applied Linguistics in Language Teaching, 49(2), 113-133. doi:10.1515/iral.2011.006

Purcell, E. T., & Suter, R. W. (1980). Predictors of pronunciation accuracy: A reexamination. Language Learning, 30(2), 271-287.

Piske, T. (2008). Phonetic awareness, phonetic sensitivity and the second language learner. In Encyclopedia of language and education (pp. 1912-1923). New York, US: Springer Publishing.

Rallo-Fabra, L., & Romero, J. (2012). Native Catalan learners' perception and production of English vowels. Journal of Phonetics, 40(3), 491-508. doi:10.1016/j.wocn.2012.01.001

Ramírez Verdugo, D. (2006). A study of intonation awareness and learning in non-native speakers of English. Language Awareness, 15, 141-159.

Rato, A., & Rauber, A. (2015). The effects of perceptual training on the production of English vowel contrasts by Portuguese learners. In the Scottish Consortium for ICPhS 2015 (Ed.), Proceedings of the 18th International Congress of Phonetic Sciences. Glasgow, UK: Glasgow University.

Rauber, A., Rato, A., Kluge, D., & Santos, G. (2012). TP (Version 3.1). [Software]. Brazil: Worken.

Saito, K. (2012). Effects of instruction on L2 pronunciation development: A synthesis of 15 quasi-experimental intervention studies. TESOL Quarterly, 46(4), 842-854.

Saito, K. (2013). The acquisitional value of recasts in instructed second language speech learning: Teaching the perception and production of English /ɹ/ to adult Japanese learners. Language Learning, 63, 499-529.

Saito, K. (2015). Variables affecting the effects of recasts on L2 pronunciation development. Language Teaching Research, 19(3), 276-300. doi:10.1177/1362168814541753

Saito, K., & Lyster, R. (2011). Effects of form-focused instruction and corrective feedback on L2 pronunciation development of /ɹ/ by Japanese learners of English. Language Learning, 62(2), 595-633. doi:10.1111/j.1467-9922.2011.00639.x

Setter, J., & Jenkins, J. (2005). State-of-the-art review article. Language Teaching, 38(1), 1-17.

Scovel, T. (1988). A time to speak. A psycholinguistic inquiry into the critical period for human speech. Cambridge, MA: Newbury House.

Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129-158.

Schmidt, R. (1995). Consciousness and foreign language learning: A tutorial on the role of attention and awareness in learning. In R. Schmidt (Ed.), Attention and awareness (pp. 1-63). Honolulu, HI: University of Hawai`i, National Foreign Language Resource Center.

Silveira, R., & Alves, U. (2009). Noticing e instrução explícita: Aprendizagem fonético-fonológica do morfema -ed. Nonada Letras em Revista, 2(13), 149-159.

Solé, M. J., & Estebas, E. (2000). Phonetic and phonological phenomena: VOT: A cross-language comparison. In Proceedings of the 18th AEDEAN Conference (pp. 437-44). Vigo: University of Vigo.

Thomson, R. I., & Derwing, T. M. (2014). The effectiveness of L2 pronunciation instruction: A narrative review. Applied Linguistics, 36(3), 326-344. doi:10.1093/applin/amu076

VanPatten, B. (1996). Input processing and grammar instruction in second language acquisition. Norwood: Ablex Publishing.

Venkatagiri, H. S., & Levis, J. (2007). Phonological awareness and speech comprehensibility: An exploratory study. Language Awareness, 16, 263-277.

Vitevitch, M. S., & Luce, P. A. (1999). Probabilistic phonotactics and neighborhood activation in spoken word recognition. Journal of Memory and Language, 40(3), 374-408.

White, J., & Ranta, L. (2002). Examining the interface between metalinguistic task performance and oral production in a second language. Language Awareness, 11, 259-290.

Wrembel, M. (2005). Phonological Metacompetence in the Acquisition of Second Language Phonetics (Unpublished Doctoral dissertation). Adam Mickiewicz University, Poznan.

Wrembel, M. (2011). Metaphonetic awareness in the production of speech. In M. Pawlak, E. Waniek-Klimczak & J. Majer (Eds.), Speaking and instructed foreign language acquisition (pp. 169-182). Clevedon: Multilingual Matters.

Wrembel, M. (2013). Metalinguistic awareness in third language phonological acquisition. In K. Roehr & G. A. Gánem-Gutiérrez (Eds.), The metalinguistic dimension in instructed second language learning (pp. 119-144). London: Bloomsbury.

1 Throughout the paper, L2 refers to a second language learned in the foreign language (FL) setting. Thus, L2 and FL will be used interchangeably.

2 For the purposes of this paper, formal instruction will refer to all the exposure to the TL received formally, that is, in the FL classroom setting.

3 The data presented in this study come from a larger-scale PhD project investigating the effect of phonetic training on the perception and production of vowel and consonant sounds. The group presented here served as a control group in the larger study.

4 The English variety adopted in the courses was SSBE; however, students’ accuracy in producing SSBE vowels was only controlled for or assessed during the Phonetics and Phonology classes.

5 Not all instructors were speakers of the SSBE variety of English and, as previously mentioned, some instructors were non-native speakers of English.


7 For the purpose of the present study, the learners’ language dominance was not expected to play a role, as Catalan and Spanish do not differ in terms of their L1 vowels in relation to the five vowels under study.

8 All perception tasks used in this study were validated by three SSBE NES.

9 Exposure to the TL outside the classroom was not controlled for in the present investigation.

* Professor of English linguistics and head of English at the Faculty of Education at Universitat Internacional de Catalunya (UIC), in Barcelona, Spain. She holds a PhD in English philology and a Master’s Degree in second language acquisition from the Universitat Autònoma de Barcelona (UAB). The influence of one’s native language phonology on the acquisition of a second language constitutes her main area of research interest, along with the effect of formal instruction and stay abroad programmes on participants’ oral skills. Her e-mail address is

** Holds a PhD in Applied Linguistics from the University of Barcelona and is an Assistant Professor at the Federal University of Santa Catarina (UFSC). Her research focuses on L2 speech acquisition by looking into the individual cognitive variables that underlie L2 speech perception and production. Her e-mail address is

Appendix 1

Testing stimuli (Non-words)

/æ-ʌ/          /ɪ-i/          /ɜː/
vab    vap     veeb   veep    jurb
zad    zat     jeed   jeet    jerd
vag    vack    veeg   veek    verg
vub    vup     vib    vip     jurp
zud    zut     jid    jit     jurt
vugg   vuck    vig    vick    verk

Testing stimuli (Real words)

/æ-ʌ/          /ɪ-i/          /ɜː/
cap    cab     feet   feed    hurt   heard
pup    pub     bit    bid

Production elicitation list

/æ-ʌ/          /ɪ-i/          /ɜː/
cap    cab     bit    bid     hurt   heard
buck   bug     feet   feed

Received: November 15, 2017; Accepted: July 03, 2018

This is an open-access article distributed under the terms of the Creative Commons Attribution License.