Speech fluency as an indicator of second language (L2) oral proficiency has gained increasing interest in recent years. The accumulated research has established fluency as a multi-faceted phenomenon, which can be reliably measured from L2 speech samples with a set of temporal measures, such as articulation rate or the frequency of silent pauses. However, emerging evidence suggests that many measures traditionally used for examining L2 speech fluency might be connected to or influenced by the equivalent measures in the speaker’s first language (L1; for a meta-analysis, see Gao & Sun, 2024). That is, L2 fluency measures may not be as straightforward indicators of L2 proficiency as previously thought but can rather be prone to influences from the speaker’s personal speaking style (i.e., the temporal features in their L1 speech). Therefore, it is important to emphasize that features that are linked to a speaker’s personal style should not impact estimates of their L2 proficiency negatively. Furthermore, while task-based second language acquisition (SLA) research has widely examined the impact of task features on L2 performance (e.g., Skehan et al., 2016), including task mode effects (e.g., Gilabert et al., 2011), the bulk of L2 speech fluency research has focused on monologic speech, such as picture descriptions based on comic strip prompts. This may provide an inadequate view of the speakers’ L2 skills, overlooking the interactive element of L2 speech fluency (e.g., Peltonen, 2020; Wright & Tavakoli, 2016).
While some previous studies exploring task mode effects on L2 speech fluency have suggested that dialogic contexts might facilitate fluent speech production (e.g., Tavakoli, 2016; Witton-Davies, 2014), research in this area is still relatively limited, and more studies are needed to confirm the tendencies discovered in previous research. Importantly, previous task-based studies on fluency have rarely included L1 speech data from the participants along with their L2 production.
The present study is thus among the first to bring together these two strands of L2 speech fluency research, combining the analysis of L1 speaking style and task mode effects on L2 fluency in a research design that includes L1 and L2 speech data from the same speakers based on monologic and dialogic tasks. As L2 proficiency level has been found to be linked to L2 fluency (e.g., Tavakoli et al., 2020) and to potentially influence the L1–L2 fluency relationship (e.g., Peltonen, 2018), we also include the learners’ proficiency level in the analyses (based on their LexTALE scores; Lemhöfer & Broersma, 2012). Applying this unique design to the analysis of speech data from 50 Finnish-speaking advanced learners of English, we analyzed the monologic and dialogic tasks for speed, breakdown, repair, and composite fluency (Skehan, 2009). Multiple linear regressions were conducted to examine the extent to which measures of L2 fluency can be predicted from L1 fluency measures, task mode (monologue vs. dialogue), and L2 proficiency level. Thus, the study importantly extends previous L2 speech fluency research by examining both monologic and interactional data, while also broadening research on L1 speaking style effects to a novel, dialogic context. Our study provides significant theoretical and methodological contributions to the field of L2 speech fluency research: we extend fluency analyses to an interactive setting and thus revisit the mostly monologue-based fluency theories and methodologies. Furthermore, through this design, we are able to provide a novel angle to the bulk of research exploring L1 speaking style effects on L2 speech fluency, which, to our knowledge, have not yet been explored in interactive settings.
Theoretical background
Speech fluency
Speech fluency is widely examined as an element of L2 oral proficiency pertaining to the smoothness and effortlessness of speech (Chambers, 1997), which can be captured by examining the rate of speech, pausing, and repair phenomena. Chambers’ (1997) characterization of fluency as mainly a temporal phenomenon is in line with Lennon’s (1990) view of fluency in the narrow sense, which is generally applied in L2 fluency research contexts: speech fluency is often studied by examining (dis)fluency features, such as silent pauses, in L2 speech, and thus separated from other aspects of L2 proficiency, such as accuracy or complexity (see Housen et al., 2012). In contrast, in Lennon’s (1990) broad sense, fluency is equated with the overall (oral) proficiency level in a particular L2. In the present study, fluency is understood in the narrow sense. We operationalize this narrow sense by employing Skehan’s (2009, 2014) oft-cited threefold framework in our selection of fluency measurements, which relate to the dimensions of speed, breakdown, and repair fluency, along with the composite measure speech rate that combines the dimensions of speed and breakdown. In terms of Segalowitz’s (2010) threefold framework of cognitive, perceived, and utterance fluency dimensions, the focus is thus on utterance fluency; that is, the measurable aspects of speech associated with smoothness and effortlessness (or the lack of them; Chambers, 1997), reflecting a narrow understanding of fluency (Lennon, 1990). Cognitive fluency, which refers to the smoothness of underlying processing, is not directly examined in the present study but is reflected, to some extent, in the temporal fluency measurements used.
However, it should be acknowledged that the link between the cognitive processing and speech fluency features is more complex in interactive settings than in monologues due to the presence of the interlocutor (see, e.g., Gilabert et al., 2011). Perceived fluency (listeners’ perceptions of speakers’ fluency; e.g., fluency assessments) is excluded from the scope of the present study altogether.
The bulk of L2 speech fluency research has been conducted on monologic speech data, with the majority of fluency definitions, theories, and measurements reflecting the view of fluency as an individual speaker’s characteristic (see also Peltonen, 2024). However, as our everyday contexts involving the use of L2 often take place in dialogic settings with an interlocutor, fluency researchers have recently advocated for more social perspectives on fluency to increase the ecological validity of L2 speech fluency research (e.g., Peltonen, 2020, 2024; Tavakoli, 2016; Wright & Tavakoli, 2016). Although such studies remain less common than those conducted on monologic data, since Riggenbach’s (1991) pioneering study on fluency in a dialogic setting, some fluency studies have begun exploring L2 fluency from an interactional perspective: for instance, Sato’s (2014) study combined utterance and perceived fluency perspectives, and Van Os et al. (2020) conducted an experimental study on perceived interactional fluency. Both of these studies suggest that, conceptually, fluency in an interactive setting differs from fluency in monologic settings.
A similar notion is echoed in McCarthy’s (2010) concept of confluence, which suggests that fluency is jointly maintained by both participants in interaction. Peltonen’s (2020) notion of interactional fluency (previously termed dialogue fluency in Peltonen, 2017) builds on the idea of confluence and refers to the participants’ joint efforts in maintaining the flow of speech across individual turns. Thus, as discussed in Peltonen (2020), in dialogic contexts, along with examining individual fluency, which refers to the within-turn fluency features attributable to the individual speakers participating in the interaction, the social-collaborative aspect of interactional fluency can be captured by examining between-turn phenomena. While individual fluency measures can thus be applied to monologic and dialogic data, interactional fluency measures capture the phenomena unique to fluency in a dialogic context, such as between-turn pauses (Peltonen, 2020). To ensure comparability across the monologic and dialogic conditions, the present study focuses on the individual fluency measures but incorporates elements of interactional fluency in the silent pause and speech rate measures by including between-turn silent pauses in the analyses (see also Tavakoli, 2016).
L1–L2 connections and L2 proficiency level effects on speech fluency
Recent fluency research has widely explored the impact of an individual’s speaking style, i.e., the fluency-related, temporal features in one’s L1 speech, on L2 speech fluency (e.g., De Jong et al., 2015; De Jong & Mora, 2019; Derwing et al., 2009; Duran-Karaoz & Tavakoli, 2020; Gagné et al., 2025; Gao & Sun, 2023; Huensch, 2023; Huensch & Tracy-Ventura, 2017; Jiránková et al., 2019; Peltonen, 2018; Peltonen & Lintunen, 2022; Pérez Castillejo & Urzua-Parra, 2023). This line of research has shifted the focus from between-group comparisons across non-native and native speakers to within-group, or more specifically within-individual comparisons, highlighting the variation across speakers regarding fluency (e.g., Götz, 2013). Following the pioneering longitudinal study exploring the correlations between 16 Slavic and 16 Mandarin speakers’ L1 fluency and L2 English fluency by Derwing et al. (2009), the bulk of studies exploring the connections between L1 and L2 speech fluency has been conducted within the past ten years, highlighting the current nature of the topic. To date, the majority of these studies have been based on monologic data (but see Gagné et al., 2025; Huensch, 2023).
De Jong et al.’s (2015) study explored English (n = 29) and Turkish (n = 24) speakers’ fluency in their L1 and L2 Dutch, focusing especially on whether measures corrected for L1 fluency would be better predictors of L2 proficiency than uncorrected measures. Regression analyses showed that the breakdown fluency measure mean length of silent pauses between Analysis of Speech Units (AS-units; Foster et al., 2000) had the strongest predictive power: the L1 measure explained 57% of the variance in the corresponding L2 measure. However, the only measure where the L1-corrected measure proved to be a stronger predictor of L2 proficiency was the mean syllable duration (inverse articulation rate, a measure of speed fluency).
Extending De Jong et al.’s (2015) design with a longitudinal component, Huensch and Tracy-Ventura (2017) examined English-speaking adult learners of Spanish (n = 24) and French (n = 25) for their L1 and L2 fluency before and after study abroad. They also found significant correlations across several L1 and L2 fluency measures, but, combined with an analysis of cross-linguistic differences and changes over time, their findings underscore the complex interplay between L1 fluency measurements, cross-linguistic differences, and L2 proficiency in influencing L2 fluency. In a recent conceptual replication of Huensch and Tracy-Ventura (2017), Pérez Castillejo and Urzua-Parra (2023) examined the same native and target languages (L1 English, L2 Spanish) among a lower proficiency level group (A2 to B1 in the Common European Framework of Reference for Languages [CEFR]; Council of Europe, 2001) with a different task (self-referential monologue), excluding the study abroad component and focusing on an instructed setting (see also Duran-Karaoz & Tavakoli, 2020; Peltonen, 2018). Huensch and Tracy-Ventura’s (2017) findings were replicated for speed and repair fluency measures, as well as silent pause duration, but the results concerning silent and filled pause frequency were different (Pérez Castillejo & Urzua-Parra, 2023).
Together, the findings from the studies suggest that there are some L1–L2 fluency relationships that emerge early on and remain stable with increasing L2 proficiency, while other aspects may be more prone to task type and proficiency level effects (Pérez Castillejo & Urzua-Parra, 2023), echoing Derwing et al.’s (2009) suggestion that some aspects of fluency may be more traitlike and others more statelike.
Kahng’s (2020) study of 44 Chinese learners of L2 English disentangled the relationships between L1 and L2 speech fluency further by including cognitive fluency in the analyses along with utterance fluency indicators. She found moderate to strong correlations between the majority of the examined L1 and L2 fluency measures (strongest correlation found for mean silent pause duration, r = .57), and the stepwise multiple regression analyses showed that for most L2 fluency measures, both L2-specific cognitive fluency measures and the corresponding L1 fluency measure were significant predictors. More recently, Suzuki and Kormos (2024) explored the moderating effect of cognitive fluency (linguistic resources and processing speed) on the L1–L2 fluency relationship, demonstrating significant effects for the dimension of speed fluency, but not for breakdown or repair dimensions. While the present study does not address cognitive fluency, the results of these studies point to the importance of examining the potential moderating effects on the L1–L2 fluency relationship and extending this line of research from monologic to dialogic conditions.
Out of the potential moderating effects, the present study explores the impact of L2 proficiency on the relationship between L1 and L2 fluency (see also Götz, 2019a). In a study examining L1–L2 connections in speech fluency among Finnish-speaking learners of English (N = 42) at two school levels (Group 1 ninth-grade, 15-year-olds; Group 2 upper secondary school, 17-year-olds; roughly representing the levels B1 and B2 in the CEFR, respectively) based on picture description tasks, Peltonen (2018) demonstrated, overall, stronger L1–L2 fluency correlations for the more proficient group. In another recent study involving L1 Turkish speakers of English (N = 42) at A2, B1, and B2 levels in the CEFR, statistically significant moderate to strong correlations between L1 and L2 were found especially for breakdown and repair fluency measures; yet, controlling for proficiency level in partial correlations did not substantially change the strength of the associations (Duran-Karaoz & Tavakoli, 2020). Thus, the two studies provide somewhat differing results regarding the potential role of L2 proficiency level in influencing L1–L2 connections. Yet, taken together with the previously discussed longitudinal studies (e.g., Derwing et al., 2009; Huensch & Tracy-Ventura, 2017) suggesting potential shifts in L1 speaking style influence with developing L2 proficiency, proficiency level is likely to play at least some role in L1–L2 fluency connections, motivating our choice to examine its impact in the present study.
Further support for examining the effect of L2 proficiency level on the connections between L1 and L2 fluency comes from L2 speech fluency research exploring how different aspects of fluency develop as the learners’ proficiency level increases, with results generally pointing to higher fluency for higher proficiency level participants (e.g., Kormos & Dénes, 2004; Tavakoli et al., 2020; for a meta-analysis, see Yan et al., 2025).
Recently, the scope of L2 fluency research has been extended to cover a broad range of participants from different age groups and linguistic backgrounds. As the majority of previous studies have focused on adult learners (but see Peltonen, 2018), Gao and Sun (2023) focused on younger, 14-year-old Chinese learners of English (N = 47). Based on fluency analyses of L1 and L2 speech, they found overall somewhat weaker correlations between L1 and L2 speech fluency measures than previous studies. These findings highlight the L1 linguistic development of teenagers as a potential factor in explaining (the lack of) correlations between L1 and L2 fluency (Gao & Sun, 2023), along with the previously suggested L2 proficiency (see, e.g., Peltonen, 2018). Another recent contribution to the field has been to extend the study of connections between L1 and L2 fluency to multilingual learners and to examine the connections in fluency across multiple languages in the learners’ repertoire. A recent study exploring the connections across L1 Finnish, L1/L3 Swedish, and L2 English speech among two groups of participants (Group 1 L1 Finnish, n = 20, and Group 2 Finnish-Swedish bilingual university students, n = 10) based on a monologic picture description task demonstrated the strongest correlations between L1 Finnish and L2 English for Group 1, while the correlations for Group 2 were, overall, moderate to strong across the different language pairs (Peltonen & Lintunen, 2022). This finding was related to the overall lower proficiency level in L3 Swedish for Group 1, further highlighting the potential role of L2/L3 proficiency level in influencing the connections across L1 and L2 speech fluency.
The rapidly accumulating findings from this field have also recently been examined with a meta-analysis exploring the impact of L1 speaking style on L2 speech fluency and the factors influencing the connections between L1 and L2 fluency (Gao & Sun, 2024). Incorporating many of the studies discussed above, the findings of the meta-analysis suggest that the degree of L1 influence varies across different dimensions of fluency, with the strongest effects found for the breakdown fluency dimension and the weakest for repair fluency (Gao & Sun, 2024). Moreover, the findings indicate that L1 effects are subject to various factors, including the L2 learning contexts, task type, and task consistency. While Gao and Sun (2024) only examined task type in terms of open (e.g., interview) vs. closed (e.g., picture narration) tasks and not based on monologic vs. dialogic tasks, their finding of open tasks exhibiting stronger L1–L2 correlations could be reflected in the present study’s data as well, as the dialogic task is more open in nature than the monologic one.
Task mode effects on speech fluency
In SLA research, the impact of task characteristics on the dimensions of L2 complexity, accuracy, and fluency (CAF; see, e.g., Housen et al., 2012) has been examined extensively. Some studies have manipulated task features, such as task difficulty (e.g., Préfontaine & Kormos, 2015) or narrative structure (e.g., Skehan et al., 2016) within one task mode, often monologue, while other studies have manipulated task features, such as task complexity, in both monologic and dialogic conditions (e.g., Gilabert et al., 2011; Michel, 2011; Wright, 2021), thus incorporating task mode effects in their designs. In contrast, L2 speech fluency research tends to use research designs including only one task type that elicits monologic speech (e.g., picture-based narratives). The preference for monologic data is potentially due to the more controlled nature of the task itself and the more straightforward way of measuring (an individual’s) fluency performance as opposed to interactive data (see Tavakoli, 2016). However, some studies have explored the effects of monologic vs. dialogic task mode on L2 speech fluency (e.g., Gráf, 2019; Lehtilä, 2021; Tavakoli, 2016), occasionally incorporating a fluency development aspect in the design (e.g., Witton-Davies, 2014). In these studies, the dialogic task has involved L2 peer interaction, but unlike in the present study, L1 speech data from the participants has not been included in the research design.
In contrast, a few recent studies (Gagné et al., 2025; Huensch, 2023) exploring the L1–L2 fluency connections have incorporated dialogic data along with monologic data, but these have been based on semi-structured interviews. The dialogic task in these studies is thus less symmetrical than the peer interaction data used in other fluency studies examining task mode effects. Therefore, the focus of the present study is unique, as it incorporates dialogic L2 peer interaction data along with monologic data and corresponding L1 samples from the participants in both modes.
The accumulating evidence from task-based fluency studies suggests that performances in the dialogic condition tend to be more fluent than in the monologic condition (e.g., Michel, 2011; Wright, 2021). Examining 64 L2 learners of Dutch performing monologic and dialogic tasks with CAF measures, including four indices of fluency (pruned and unpruned speech rate, number of filled pauses, and number of repairs), Michel (2011) found that the L2 learners produced faster speech and fewer filled pauses and repairs in the dialogic condition. She attributed this finding to the opportunity to plan one’s own turn during the interlocutor’s turn, which alleviates the pressure on speech planning and production and makes the dialogic task potentially cognitively less demanding. Michel (2011) also noted that participants align their behavior in interaction in various ways (on alignment, see Pickering & Garrod, 2004), which may account for previous findings of weaker correlations between fluency (measured with speech rate and the frequency of filled pauses) and L2 proficiency in the dialogic mode compared to the monologic mode (Gilabert et al., 2011).
More recently, Wright (2021) examined the development of L2 Mandarin (N = 10) during study abroad, incorporating both monologic and dialogic data (rehearsed and spontaneous conditions in both modes). Based on measures of articulation rate, hesitation rate, mean length of run, mean length of silent pauses, and number of silent pauses, the results suggest that the spontaneous dialogic task after study abroad was more fluent than the corresponding monologic task in terms of the pausing measures, in particular. These findings complement Michel’s (2011) results from the perspective of silent pauses and a different L2, adding further evidence to the facilitative effect of the dialogic mode.
As the previously discussed task-based studies often examine fluency in the broader CAF context, fewer fluency measures are usually employed than in studies focusing solely on L2 speech fluency. Yet, results from studies on L2 fluency across monologic and dialogic conditions also generally point to higher fluency in dialogues (e.g., Lehtilä, 2021; Tavakoli, 2016; Witton-Davies, 2014), supporting findings from task-based CAF studies. Tavakoli (2016) examined 35 L2 English university students’ monologic (personal narrative) and dialogic (argumentative discussion) performances with a range of measures capturing the speed, breakdown, repair, and composite dimensions of fluency, along with two dialogue-specific measures (number of turns and number of interruptions). The results demonstrated statistically significantly more fluent performance in the dialogue on speed (articulation rate), length-based breakdown (mean length of silent pauses), and composite (speech rate, mean length of run, phonation-time ratio) measures, but not on the frequency or location of silent pauses, regardless of whether between-turn pauses were included in the analyses or not. In the present study, the “joint” silent pauses (between-turn pauses, TPs) were incorporated in the analyses of the dialogic data.
Similar results to Tavakoli (2016) were obtained by Witton-Davies (2014) and Lehtilä (2021). Witton-Davies’ (2014) study included a longitudinal component, as he examined Taiwanese L2 English university students’ (N = 17) fluency development over four years based on monologic (picture story retell) and dialogic (discussion) task performances. Witton-Davies (2014) found that the dialogic performance was more fluent on several fluency measures, including speed (articulation rate), breakdown (frequency and duration of silent pauses), repair (especially the frequency of reformulation), and composite (speech rate) measures. Based on analyses of Finnish upper secondary school learners’ (N = 22) L2 English monologic (picture description) and dialogic (problem-solving task) speech, Lehtilä’s (2021) findings also support more fluent performance in a dialogic condition, notably for breakdown (frequencies and mean lengths of mid-clause and end-clause silent pauses) and composite (speech rate) measures. In contrast to Tavakoli’s (2016) and Witton-Davies’ (2014) findings, however, Lehtilä (2021) did not find a statistically significant difference in the pure speed measure of articulation rate.
Summary and research question
Taken together, the research discussed above provides compelling evidence that L1 speaking style and task mode impact L2 speech fluency, underscoring the need to combine the two research traditions and examine the impact of these factors together. The present study thus advances the field by examining the effects of both L1 speaking style and task mode on L2 speech fluency. In addition, we incorporate the effect of L2 proficiency in our analyses to provide insights into a factor potentially impacting the relationship between L1 and L2 speech fluency. Our study thus addresses the following research question:
To what extent can L2 speech fluency measures be predicted based on L1 speech fluency measures, task mode (monologue vs. dialogue), and L2 proficiency level?
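As a rough illustration only, the research question can be read as a multiple linear regression fitted separately for each fluency measure. The form below is a sketch under stated assumptions: it includes additive main effects only, and the symbols (the subscripts i for speaker and j for task mode, the predictor labels, and the coefficients) are introduced here for illustration. Whether interaction terms were modeled, and how the repeated observations per speaker were handled, is not specified in this section.

```latex
\[
\text{L2}_{ij} = \beta_0 + \beta_1\,\text{L1}_{ij} + \beta_2\,\text{Mode}_{j} + \beta_3\,\text{Proficiency}_{i} + \varepsilon_{ij}
\]
```

Here L2 and L1 denote the same fluency measure (e.g., articulation rate) in the two languages, Mode is a dummy variable for monologue vs. dialogue, and Proficiency is the speaker's LexTALE score.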
Methodology
Participants
The sample included 50 university students of English (42 major subject students, 8 minor subject students), selected from a pool of participants who took part in a larger fluency-themed project (Fluency and Disfluency Features in L2 Speech, funded by the Research Council of Finland in 2020–2024; decision number 331903). The English majors in the sample had started their studies in the year of the data collection. All participants spoke Finnish as their L1. The average age was 21.9 years (SD = 4.88; median 20). Of the 50 participants, 58% identified as female, 34% as male, 2% as other, and 6% did not want to disclose their gender. On average, the participants had studied English for a total of 10 years (SD = 1.04; min. 7, max. 13) at school before university. Their proficiency level was estimated at C1/C2 level in the CEFR on average, based on the LexTALE receptive vocabulary test scores (M = 86.4%, SD = 8.06; scores over 80% indicate level C1/C2; Lemhöfer & Broersma, 2012). LexTALE is a widely used, validated vocabulary test in SLA research, having demonstrated substantial correlations with overall proficiency in English (see Lemhöfer & Broersma, 2012). In total, 74% of the participants represented the C1/C2 level (scores above 80%), with the remaining 26% representing the lower B2 level (scores between 60% and 80%). Fourteen participants had stayed abroad for more than a month (e.g., studying abroad). None of the participants reported language-related impairments. All participants signed informed consent forms, and the processing of personal data was detailed in privacy notices provided during data collection, in line with the EU’s General Data Protection Regulation.
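The score bands described above can be expressed as a small lookup. The following is a minimal sketch rather than the procedure used in the study: the function name is hypothetical, a score of exactly 80% is placed in the B2 band because the text specifies "above 80%" for C1/C2, and the label for scores under 60% is an assumption, as no participant in the sample is reported in that range.

```python
def lextale_band(score_percent: float) -> str:
    """Map a LexTALE score (in percent) to the CEFR band described in the text."""
    if not 0 <= score_percent <= 100:
        raise ValueError("LexTALE scores are percentages between 0 and 100")
    if score_percent > 80:
        return "C1/C2"       # "scores above 80%"
    if score_percent >= 60:
        return "B2"          # "scores between 60% and 80%"
    return "below B2"        # assumption: band not described in this section


# The sample mean reported above (86.4%) falls in the C1/C2 band.
print(lextale_band(86.4))
```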
Data collection and procedure
Two types of speech data are used in the present study: monologic speech based on picture description (monologic task) and dialogic speech based on a problem-solving task, completed in pairs (dialogic task). The participants completed both the monologic and the dialogic tasks in their L1 Finnish and L2 English. The monologic speech samples in L1 Finnish and L2 English were collected in a group setting in a language laboratory using two comic strip prompts that were comparable in length (six frames) and in the clarity of the storyline. The order of the prompts and languages was counterbalanced, and the speech samples were audio-recorded. Following two minutes of planning time, the participants were asked to tell the story conveyed by the prompts in their own words while looking at the cartoon. This procedure was then repeated in the other language.
During the same day, after having completed other parts of the data collection (not reported in this study), the participants completed the two comparable dialogic tasks in a quiet room with a randomly chosen partner, with only a research assistant present to audio- and video-record the interactions. After two minutes of individual planning time, the participants were given six minutes to complete the task. Research assistants announced when five minutes had passed but did not otherwise intervene. The procedure was repeated in the other language with the same partner. As with the monologic task, the order of the task prompts and the languages was counterbalanced in the data collection to avoid task order effects. The dialogic tasks were based on two problem-solving scenarios, both involving sixteen items (black-and-white images printed on paper) that had to be ranked in the order of usefulness for two different purposes. In task A, the items were to be used to facilitate survival after having been stranded on a desert island (see Peltonen, 2025), while in task B, the participants were asked to imagine that they had crash-landed on the moon and had to make their way to the mothership with the help of the items. In addition to justifying the purpose of each item for survival, the instructions encouraged the participants to achieve a joint agreement on the order of importance for the items.
The monologic speech samples (mean sample duration in L1 Finnish = 62.51 sec., SD = 16.98; mean sample duration in L2 English = 67.99 sec., SD = 24.56; L1 Finnish syllables M = 243, SD = 77.23; L2 English syllables M = 181, SD = 72.64) were transcribed and double-checked by a group of research assistants as part of course work. Silent pauses of at least 0.25 seconds (De Jong & Bosker, 2013) were identified by the first author in Praat (Boersma & Weenink, 2022) using a script (De Jong & Wempe, 2009). The silent pause annotations were manually checked by the research assistants, who also manually annotated the self-repetitions in Praat. Finally, a script for extracting speaking time durations, silent pause frequencies, silent pause durations, and the frequencies of self-repetitions (Lennes, 2002) was used to enable calculations of the fluency measurements. The frequency-based fluency measures were standardized per minute of speaking time (total time excluding silent pause time; De Jong, 2016).
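The standardization described above can be illustrated with a minimal sketch. It is not the script used in the study: the function name and the example values are hypothetical, and it only assumes that speaking time equals total time minus silent pause time, as stated in the text.

```python
def per_minute_of_speaking_time(count, total_time_s, pause_time_s):
    """Standardize a raw frequency count per minute of speaking time.

    Speaking time is taken to be total time minus silent pause time
    (both in seconds), following the definition given in the text.
    """
    speaking_time_s = total_time_s - pause_time_s
    if speaking_time_s <= 0:
        raise ValueError("speaking time must be positive")
    return count / speaking_time_s * 60.0


# Hypothetical sample: 60 s in total, 15 s of silent pausing, 12 silent pauses.
sp_per_minute = per_minute_of_speaking_time(12, 60.0, 15.0)
print(round(sp_per_minute, 2))  # 16.0
```

Standardizing by speaking time rather than total time keeps the frequency measures from being confounded with pause duration, since a speaker with long pauses would otherwise appear to pause less often per minute.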
The dialogic speech samples (mean sample duration in L1 Finnish = 347.29 sec., SD = 42.61; mean sample duration in L2 English = 364.57 sec., SD = 61.64; L1 Finnish syllables M = 764, SD = 271.54; L2 English syllables M = 587, SD = 215.89) were transcribed and double-checked by two MA students of English, followed by final checks by the first author. The silent pauses were identified with the same Praat script used for the monologic samples (minimum duration 0.25 seconds), but an upper limit for silent pauses was manually set at three seconds (Witton-Davies, 2014). The purpose of the upper limit (and of excluding the pause time exceeding it from further analyses) was to reduce the impact of such long pauses on speech rate and mean length of silent pause calculations, as they are less clearly linked to individual disfluencies than shorter pauses. The first author manually allocated the speaking time for each participant in Praat and annotated the silent pauses as within-turn (individual) or between-turn (shared) pauses. In addition to silent pauses occurring between the participants’ turns, between-turn silences longer than 2 seconds were considered shared pauses, i.e., the responsibility of both speakers, and included in between-turn pause calculations (Witton-Davies, 2014). The frequencies and durations of between-turn pauses were divided equally for each pair (e.g., Tavakoli, 2016; Witton-Davies, 2014). Self-repetitions were manually annotated in Praat by the first author. The fluency measurements were calculated based on the same procedure used with the monologic speech samples.
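The pause handling for the dialogic samples can be sketched as follows. This is a simplified illustration rather than the procedure implemented in Praat: the function names and example durations are hypothetical, and the sketch covers only the 0.25-second minimum, the three-second cap, and the equal division of between-turn pauses within a pair.

```python
MIN_PAUSE_S = 0.25  # minimum silent pause duration reported in the text
MAX_PAUSE_S = 3.0   # upper limit reported in the text


def cap_pause(duration_s):
    """Return the pause time kept for analysis, or None if below the minimum.

    Pause time beyond the upper limit is discarded (the pause is capped).
    """
    if duration_s < MIN_PAUSE_S:
        return None
    return min(duration_s, MAX_PAUSE_S)


def speaker_pause_totals(within_turn_s, between_turn_s):
    """Pause count and total pause duration attributed to one speaker.

    Within-turn pauses count fully for the speaker; between-turn (shared)
    pauses of the pair are divided equally, so half is attributed here.
    """
    own = [p for p in map(cap_pause, within_turn_s) if p is not None]
    shared = [p for p in map(cap_pause, between_turn_s) if p is not None]
    count = len(own) + len(shared) / 2
    duration_s = sum(own) + sum(shared) / 2
    return count, duration_s


# Hypothetical durations in seconds: 0.1 s is too short, 5.0 s is capped at 3.0 s.
count, duration_s = speaker_pause_totals([0.4, 1.0, 0.1], [2.0, 5.0])
print(count, round(duration_s, 2))  # 3.0 3.9
```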
Fluency measures
The fluency measures and their operationalizations, including information about the fluency dimensions they represent, are compiled in Table 1.
Table 1. Fluency measures and their operationalizations used in the present study

The fluency measures presented in Table 1 were chosen based on previous studies on monologic and dialogic fluency (see, e.g., De Jong, 2018; Peltonen, 2017, 2020; Tavakoli, 2016; Witton-Davies, 2014). Particular attention was paid to the comparability of the measures across the two modes. To offer as comprehensive a view of speech fluency as possible, the measures were selected to represent all three fluency dimensions suggested by Skehan (2009, 2014), along with one widely used composite measure (speech rate, SR) that combines the speed and breakdown dimensions. Regarding the breakdown dimension, the silent pauses (SPs) were examined for three aspects: their frequency, average duration, and location (between-turn vs. within-turn SPs in the dialogues).
Statistical analysis
The statistical analyses were performed in RStudio (R Core Team, 2024). All numeric variables were checked for normality with Shapiro-Wilk tests and a visual inspection of histograms and ECDF plots. When predictor variables were non-normally distributed, they were logged. We made sure that all model assumptions were sufficiently met. We then analyzed our data using multiple linear regressions in a downwards hierarchical model selection process (Gries, 2021) with the five L2FluencyMeasures as dependent variables and the corresponding L1FluencyMeasures, TaskMode, and LexTaleScore as the independent variables, allowing for two-way interactions between the variables. Separate models were run for the individual fluency variables. Predictors were deleted unless they significantly improved the models based on p-values. Model performance was assessed through a visual inspection of the residuals and Shapiro-Wilk tests for normal distribution of the residuals.
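For orientation, the maximal model from which this selection process starts can be written out as a sketch. The notation is illustrative and is not taken from the analysis scripts; one such model is assumed for each of the five fluency measures, with non-significant terms removed step by step.

```latex
\begin{align*}
\text{L2Fluency}_i ={}& \beta_0 + \beta_1\,\text{L1Fluency}_i + \beta_2\,\text{TaskMode}_i + \beta_3\,\text{LexTALE}_i \\
&+ \beta_4\,(\text{L1Fluency}_i \times \text{TaskMode}_i) + \beta_5\,(\text{L1Fluency}_i \times \text{LexTALE}_i) \\
&+ \beta_6\,(\text{TaskMode}_i \times \text{LexTALE}_i) + \varepsilon_i
\end{align*}
```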
Results
Descriptive statistics
Before presenting the results of the regressions, descriptive information (M, SD, CI) for each fluency measure in the two languages (L1 Finnish, L2 English) and the two task modes (monologue, dialogue) is presented in Table 2.
Table 2. Descriptive statistics (M, SD, CI) for the fluency measures by language and task mode

As can be seen from Table 2, the performances were, on average, somewhat more fluent in the L1 Finnish data regarding articulation rate (AR) and SR (speed and composite dimensions), while the patterns are less clear for the breakdown (the two SP measures) and repair dimensions. In the following, we present the results of the regression analyses.
Speed fluency (AR)
The final model that predicted L2 AR in syllables per minute of speaking time was highly significant (p < .0001) and explained 54% of the variance in the data (multiple R² = .55, adjusted R² = .54, F(2, 82) = 54.24). It included two main effects, but the interaction between the two factors was not significant and was therefore deleted from the model. The final model coefficients are presented in Table 3 and visualized in Figure 1.
Table 3. Coefficients of the final model predicting AR in the L2: AR_L2 ~ AR_L1 + TASKMODE

Note: *** p < .001, * p < .05.

Figure 1. Effects of the final model predicting AR in the L2.
L1 AR significantly predicted L2 AR (β = 0.50, SE = 0.05, t = 10.30, p < .001), indicating that speakers with faster articulation in their L1 also tended to articulate faster in the L2. Specifically, for each increase of one syllable per minute in L1 AR, the L2 AR increased by approximately 0.50 syllables per minute of speaking time, suggesting a strong cross-linguistic consistency in articulation speed.
Task mode also significantly influenced L2 AR. Participants in the monologic condition spoke more slowly in the L2 compared to the dialogic condition (β = −7.17, SE = 3.53, t = −2.03, p = .045). This indicates that, holding L1 AR constant, performing in the monologic condition reduced L2 AR by approximately 7.17 syllables per minute of speaking time on average.
Breakdown fluency (SPs per minute and the mean length of SPs)
The second model predicted the number of SPs per minute in the L2 based on the aforementioned variables in a hierarchical top-down model selection process. The final model was highly significant (p < .0001) and performed very well by explaining 45% of the variance in the data (multiple R² = .47, adjusted R² = .45, F(2, 89) = 38.96). The final model consisted of two main effects without an interaction between the two, the main effects being the number of SPs per minute in the L1 and the task mode, as summarized in Table 4 and illustrated in Figure 2.
Table 4. Coefficients of the final model predicting SPs per minute in the L2 (Logged): SP_L2_logged ~ SP_L1_logged + TASKMODE

Note: *** p < .001, ** p < .01.

Figure 2. Effects of the final model predicting SPs per minute in the L2 (logged).
As Figure 2 illustrates, there was a significant positive effect of L1 SPs (β = 0.63, SE = 0.08, t = 7.62, p < .001). This indicates that individuals who produced more SPs in their L1 were also predicted to produce more in their L2. Specifically, a one-unit increase in the number of SPs in the L1 (logged) was associated with a 0.63 unit increase in the L2 (logged), suggesting a strong within-speaker correlation in pause behavior across the two languages.
Task mode also had a significant effect on L2 SP frequency (β = 0.15, SE = 0.06, t = 2.72, p = .008). Participants produced, on average, 0.15 (logged) units more L2 SPs per minute (which roughly transforms to 16% more) in the monologic condition than in the dialogic condition.
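The back-transformation behind the "roughly 16%" figure, and the reading of the L1 coefficient in a model where both outcome and predictor are logged, can be sketched as follows. The sketch assumes natural logarithms (consistent with the 16% reported above); the function names are hypothetical, and the coefficients are the rounded values from the text.

```python
import math


def percent_change_from_log_coef(beta):
    """Percent change in the raw outcome implied by a coefficient
    on a natural-log-transformed outcome (categorical predictor)."""
    return (math.exp(beta) - 1.0) * 100.0


def percent_change_loglog(beta, predictor_increase_pct):
    """Percent change in the raw outcome when a logged predictor's raw
    value increases by the given percentage (log-log model)."""
    factor = (1.0 + predictor_increase_pct / 100.0) ** beta
    return (factor - 1.0) * 100.0


# Task mode: beta = 0.15 on logged L2 SPs per minute.
print(round(percent_change_from_log_coef(0.15), 1))  # 16.2

# L1 SPs: beta = 0.63 with both variables logged; effect of 10% more L1 SPs.
print(round(percent_change_loglog(0.63, 10.0), 1))  # 6.2
```

Read this way, a speaker who produces 10% more SPs per minute in the L1 is predicted to produce roughly 6% more in the L2, holding task mode constant.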
The second variable to be considered under this fluency dimension, the mean length of L2 SPs, revealed slightly different results. The final model revealed significant main effects of the L1 pause length (β = –1.62, p = .021) and L2 proficiency (β = –0.02, p = .004), as well as a significant interaction between the two (β = 0.03, p < .001), which is illustrated below in Table 5 and in Figure 3. The model was highly significant (p < .0001) and performed very well by explaining 61% of the variance in the data (multiple R² = .62, adjusted R² = .61, F(3, 88) = 47.72).
Table 5. Coefficients of the final model predicting mean length of SPs in the L2: SP_MEAN_L2 ~ SP_MEAN_L1 + LEXTALE + SP_MEAN_L1:LEXTALE

Note: *** p < .001, ** p < .01, * p < .05.

Figure 3. Effect of the final model predicting mean length of SPs in the L2.
Figure 3 illustrates that the connection between the L1 and L2 mean length of SPs was generally positive for all learners, but it became much stronger with the learners’ increasing LexTALE scores (β = 0.03, SE = 0.01, t = 3.43, p < .001). While the L1 effect was barely measurable for learners with a LexTALE score of 60 (indicating level B2 in the CEFR), it became very strong for learners with scores of 80, 90, and 100 (indicating level C1/C2 in the CEFR).
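How the interaction changes the L1 effect can be made concrete by computing the conditional slope of L1 mean SP length at different LexTALE scores. The following is a sketch based on the rounded coefficients reported above (β = –1.62 for L1 pause length, β = 0.03 for the interaction), so the resulting slopes are approximate; the function name is hypothetical.

```python
B_L1 = -1.62           # main effect of L1 mean SP length (rounded, from the text)
B_INTERACTION = 0.03   # L1 mean SP length x LexTALE interaction (rounded)


def l1_slope_at(lextale_score):
    """Conditional slope of L1 mean SP length on L2 mean SP length
    at a given LexTALE score: B_L1 + B_INTERACTION * score."""
    return B_L1 + B_INTERACTION * lextale_score


for score in (60, 80, 90, 100):
    print(score, round(l1_slope_at(score), 2))
# 60 0.18
# 80 0.78
# 90 1.08
# 100 1.38
```

The negative main effect of L1 pause length is thus not a negative L1–L2 relationship in itself: it is the slope extrapolated to a LexTALE score of zero, which lies outside the observed range.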
Note that this time, task mode was not a significant predictor for the final model predicting mean SP length in the L2 and was therefore not included in the final model.
Composite fluency (SR)
The final model to predict L2 SR was also highly significant (p < .0001) and reached a comparatively high explanatory power with 50% (multiple R² = .51, adjusted R² = .50, F(2, 89) = 46.57). The final model included two main effects, i.e., SR in the L1 and task mode, which are illustrated below in Table 6 and in Figure 4.
Table 6. Coefficients of the final model predicting SR in the L2: SR_L2 ~ SR_L1 + TASKMODE

Note: *** p < .001, * p < .05.

Figure 4. Effects of the final model predicting SR in the L2.
The model revealed a significant positive effect of the L1 SR (β = 0.47, SE = 0.05, t = 9.00, p < .001), indicating that for each increase of one syllable per minute in the L1 SR, the L2 SR increased by approximately 0.47 syllables per minute of total time, holding all else constant (see left panel). In contrast, performing in the monologic task mode was associated with a significant decrease in the outcome compared to the reference level (β = –10.58, SE = 4.45, t = –2.38, p = .020). This suggests that the participants’ L2 SR in the monologic task was predicted to be, on average, 10.58 syllables per minute lower than that in the dialogic task mode (see right panel).
Repair fluency (self-repetitions per minute)
Regarding repair fluency, the final model showed the poorest performance of the five L2 fluency measures, explaining only 8% of the variance in the data (p < .05; multiple R² = .10, adjusted R² = .08, F(1, 46) = 5.33). The final model contains only one main effect, namely the repeats in the L1, which is illustrated in Table 7 and in Figure 5.
Table 7. Coefficients of the final model predicting self-repetitions per minute in the L2 (logged): REP_L2 ~ REP_L1

Note: *** p < .001, * p < .05.

Figure 5. Effect of the final model predicting self-repetitions per minute in the L2 (logged).
The regression revealed a significant positive effect of the mean number of repetitions in the L1 on the learners’ L2 behavior (β = 0.34, SE = 0.15, t = 2.31, p = .03). This indicates that for each one-unit increase in the L1 repetitions, the predicted L2 repetitions increased by approximately 0.34 units. In other words, a higher repetition rate in the L1 performance was predicted to be associated with a higher rate in the L2 performance.
Note that the task mode was not a significant predictor of L2 repair fluency and had to be deleted during the model selection process.
Discussion
Our research question addressed the extent to which L2 speech fluency variables could be predicted by L1 speech fluency measures and the task mode (monologue vs. dialogue) while controlling for the learners’ L2 proficiency levels, as measured by their LexTALE scores. Overall, the regression models showed strong predictive power for the two breakdown fluency measures (the number of L2 SPs per minute 45%; the mean length of L2 SPs 61%), the speed fluency measure (L2 AR; 54%), and the composite measure (L2 SR; 50%), while the model for L2 self-repetitions only explained 8%. This overall finding is in line with the bulk of L2 speech fluency research, which has demonstrated clearer L1–L2 links, and a clearer impact of other variables, for the speed and breakdown dimensions of fluency, while the repair dimension is not equally well understood (see, e.g., Gao & Sun, 2024; Peltonen, 2020).
More specifically, L1 speech fluency demonstrated the strongest and most significant effects in all of our models, whereas task mode had comparatively weaker effects and was excluded from the final models for the mean length of L2 SPs and for L2 self-repetitions. In addition, L2 proficiency turned out to have a weaker effect than we had expected, only playing an interacting role in predicting the mean SP length in the L2.
As studies exploring both L1 speech fluency and task mode effects are rare, our study is among the first to show that while L1 fluency impacts all dimensions of fluency, the effect of task mode differs across different dimensions of fluency. The finding that L1 fluency measures can significantly predict the equivalent L2 speech fluency measures is in line with previous, mostly monologue-based L1–L2 fluency studies (e.g., De Jong et al., 2015; De Jong & Mora, 2019; Gao & Sun, 2023; Huensch & Tracy-Ventura, 2017; Jiránková et al., 2019; Kahng, 2020; Pérez Castillejo & Urzua-Parra, 2023), including studies examining the same combination of languages (L1 Finnish and L2 English) with younger learners at lower proficiency levels (e.g., Peltonen, 2018) and with a similar population of advanced adult learners (Peltonen & Lintunen, 2022).
The findings regarding task mode effects are somewhat more complex, as task mode showed the most prominent impact on the speed and composite dimensions, along with the SP frequency aspect of the breakdown fluency dimension, but not SP duration or the repair fluency dimension. Our findings are in line with Tavakoli’s (2016) results regarding the speed and composite fluency measures, demonstrating more fluent performance in dialogue (faster AR and SR). In contrast, regarding the breakdown fluency dimension, while Tavakoli (2016) found the SPs to be shorter in dialogue but no difference in the frequency of SPs across the two modes, in our study, SP frequency did differ across the two modes. The prominent role of L1 in affecting SP durations (see also, e.g., De Jong et al., 2015; Gao & Sun, 2024; Kahng, 2020), not examined in Tavakoli’s (2016) study, might explain the differing findings regarding the breakdown fluency dimension, further highlighting the importance of exploring the effects of L1 speech fluency and task mode together.
Overall, three of the five fluency measures examined in the present study indicated that the dialogic performances were more fluent than the monologic performances. This finding can be explained by the presence of the interlocutor and the overall interactive nature of the dialogic task: turn-taking provides opportunities for the interlocutor to plan during the other speaker’s turn, as has also been suggested in previous research (e.g., Michel, 2011; see also Lehtilä, 2021; Tavakoli, 2016; Witton-Davies, 2014). This reduces the need to pause, especially within one’s own turn. Furthermore, the interlocutor can assist the speaker in maintaining fluency during problem-solving sequences, such as word searches (McCarthy, 2010; Peltonen, 2017, 2020; Peltonen & Lintunen, 2024), while in the monologic condition, similar assistance is not available.
Regarding the impact of L2 proficiency level, we only found significant proficiency effects (based on the participants’ LexTALE scores) on the participants’ fluency performance for the average length of SPs in the L2. Here, we observed an interaction between the LexTALE score and the L1 fluency measure, suggesting that the connection between L1 and L2 fluency was stronger at the upper levels compared to the lower levels. This finding is in line with some previous research conducted in a monologic context: for instance, Peltonen (2018) found stronger L1–L2 correlations for B2 level participants compared to B1 level participants, and Peltonen and Lintunen (2022) demonstrated stronger correlations between L1 (Finnish) fluency and the more proficient additional language, English, compared to Swedish. In contrast, in an analysis also involving lower proficiency level participants (A2, B1, and B2), Duran-Karaoz and Tavakoli (2020) did not find proficiency to have a mediating effect on L1–L2 correlations based on analyses of partial correlations. However, our results complement these previous correlational analyses by providing more robust and direct evidence for the L1–L2 link becoming stronger as the proficiency level increases regarding the average length of SPs, as our regression analyses demonstrated that the links in the average length of SPs were stronger at the C1/C2 level (LexTALE scores of 80% or above) than at the B2 level (LexTALE scores between 60% and 80%).
While we could only find proficiency level effects on the L1–L2 connections for one of the fluency variables examined in our study, the potential impact of proficiency should not be dismissed. Our results suggest that proficiency level might only play a role for certain fluency variables, especially in interaction with other variables (notably L1 speech fluency), as has also been suggested in previous longitudinal examinations of L1–L2 fluency connections (Derwing et al., 2009; Huensch & Tracy-Ventura, 2017). The fact that proficiency level effects were found only for one variable can be explained by the high overall proficiency level in the sample and its relative homogeneity (74% of the participants representing the C1/C2 level in the CEFR).
Conclusion and outlook
With the present study, we have contributed to the overall discussion around L2 fluency and the factors strongly influencing it. Adopting a multivariate approach to our data helped not only to explain the strong main effects (notably L1 speaking style and task mode), but also a comparatively marginal effect of the learners’ L2 proficiency level. Building on the findings of Gao and Sun’s (2024) recent meta-analysis of L1–L2 connections and the factors impacting this connection, our results have demonstrated the benefits of extending the research on L1–L2 connections in fluency from monologic to dialogic (peer interactional) contexts.
The main limitations of our study relate to the analyzed sample, as we only investigated advanced Finnish learners of English. This limits the generalizability of our findings and, ideally, in the future, our approach should be extended to learners with other L1s. While our approach was motivated by the fact that the two languages belong to two different language families, it could be interesting to examine whether the findings are generalizable to language pairs within the same language family or to other combinations of different language family pairs. Furthermore, L2 fluency among learners representing lower levels of proficiency should be examined in future research, as connections between L1 and L2 may be weaker for less proficient participants (see Peltonen, 2018). Finally, while the choices for the monologic and dialogic task prompts were motivated by the respective research traditions on monologic and dialogic fluency, enabling high comparability with previous research, the present study should, ideally, be replicated with other monologic and dialogic task prompts to explore any potential task prompt and task design effects on the participants’ performance.
Despite the limitations, the novel combination of exploring the impact of both L1 speaking style and task mode on L2 speech fluency, along with the learners’ L2 proficiency level, has significant implications for L2 speech fluency research. As the studies in this field have, so far, mainly revealed effects of L1 speaking style in monologues, our findings complement and extend these previous findings by indicating that L1 speaking style also plays a role in L2 speech fluency in dialogic settings, even with potential alignment effects associated with interactive settings that could “mask” or influence this L1–L2 fluency connection (see also Gilabert et al., 2011; Michel, 2011). Furthermore, as we opted for measurements including between-turn pauses rather than excluding them, it should be noted that this decision could have impacted the results. The effects of this methodological choice could be further examined in the future by incorporating both measurements that include between-turn pauses and measurements that exclude them in a single study (see Tavakoli, 2016). Overall, our approach in examining the role of the speaker’s L1 fluency performance in combination with the task mode can thus pave the way for further research in this area, especially in combination with relevant learner background variables (e.g., Götz, 2019b) and compared to alternative approaches of exploring these effects separately.
Competing interests
The authors declare none.
Author note
This study was funded by the Research Council of Finland (project Fluency and Disfluency Features in L2 Speech, decision number 331903, and project L2 Interactional Fluency across Contexts, decision number 363643).