Il Mulino - Rivisteweb
Xxxxxx Xx Xxxxxxxxx
On conversation
(doi: 10.1418/87004)
Lingue e linguaggio (ISSN 1720-9331), Issue 1, January-June 2017
Copyright © by Società editrice il Mulino, Bologna. All rights reserved.
For further information see xxxxx://xxx.xxxxxxxxxx.xx
Licence of use
The article is made available to the user under licence for exclusively private and personal use, not for profit and without any directly or indirectly commercial purpose. Except as expressly provided by the Rivisteweb licence of use, it is forbidden to reproduce, transmit, distribute or otherwise use the article, for any purpose or end. All rights reserved.
ON CONVERSATION
Xxxxxx Xx Xxxxxxxxx
ABSTRACT: What is the function of a dialog? Traditionally its function is to reach an agreement on a range of knowledge to be conveyed, in order to make this shared. The language, that is a grammar shared among the interlocutors, is just the main means by which this aim is pursued.
The paper investigates to what extent the metrics of speech can be induced by the pragmatic conditions of communication. The qualitative and statistical analysis shows that rhythmical patterns vary according to the conversational goals of the speakers.
The results are compatible with a different answer to the question with which we began. The function of the dialog seems mainly to be the realization of an activity together, and this final objective is achieved through the instrument of a mutual understanding: in other words, the dialog seems above all a cooperation, which makes use of language, but which does not aim at language.
KEYWORDS: rhythm, conversation, synergy, pragmatics, Italian.
1. INTRODUCTION
According to some recent proposals (Xxxxxxxx et al. 2012; Xxxxxxxx, Xxxxxxxxx-Xxxxxxxx & Tylén 2014; Xxxxxxxx & Xxxxx 2015), we make a specific distinction between classical and interpersonal approaches to the dialog. The classical approaches to dialog are largely based on models of individual linguistic processing. According to these approaches, each interlocutor enters the conversation as an independent entity, and brings his/her (linguistic and encyclopedic) competence in the dialog. In short, the conversation exists after the interlocutors, and it mirrors their competences. Thus, the interlocutors preliminarily have to share a common grammar, and that is the main reason why the dialog has been considered as a means for eliciting the grammar of a language through the so-called map-task method (cf. Xxxxx et al. 1983; Xxxxxxxx et al. 1991; Xxxxxxxx et al. 1995; Xxxxxxxx et al. 1997). According to these approaches, the function of the dialog is to make interlocutors reach a mutual understanding by means of a common grammar. The assumption is that the dialog is an interaction of independent and internally controlled individuals making the alignment of initially diverging situation models the key step for the success of their interaction. In addition, a second assumption is that the interaction is
LINGUE E LINGUAGGIO XVI.1 (2017) 147–171
all the more successful the more it aligns the initial linguistic and cognitive xxx- xxxxxxxx of the interlocutors.
The main problem with the expectations of the classical approach is their incompatibility with two kinds of data: the stability or complementary nature of the dialog, and the synergic nature of the linguistic interaction between the interlocutors. As for the stability, many studies have revealed that the dialog is an instance of an orderly behavior among the parts of an assembly, in which the interlocutors realize their lexical, prosodic, and rhythmic (i.e. the distribution of their pauses) choices in a complementary way. This stability of the dialog clashes with the expectation of a dynamics bringing the interlocutors from an initial misalignment to a final alignment by means of the dialog.
Likewise, for the synergy between the interlocutors, here we will show that even in the case of a conflictual conversation, the interlocutors reciprocally align their linguistic behavior in terms of syllable and metrical feet durations (i.e. facts that are likely to be below their threshold of awareness), and they do that from the beginning of their interaction, up to contradicting the expected dynamic nature of the conversation.
According to the interpersonal approaches, language is a pre-eminent case of joint action (Xxxxx 1996), and the dialog is a functional synergy that pushes the interlocutors to reciprocally align their linguistic behaviors and representations on multiple levels. As for the function of such synergy, it responds to the environmental demands imposed on a dyad: to be in a joint task, to establish or maintain the social relations inside the group. It is not easy to say for what purpose the individuals in a group need to cooperate; probably, this is because it is necessary to express their membership in a group and then establish and maintain cohesive social relations. In any case, the dialog presents the same characteristics and functions of any other human action intended for a co-operative end; the interlocutors operate as the coordinated movements of the arm muscles that strike a hammer to drive a nail into the wall. The dialog seems to be predominantly a form of action, a special action because it is co-operative, but not different from those actions with a motor characteristic or with a social nature that are imposed on individuals as members of a group, and therefore because they need to express their membership using shared forms of acting. In short, according to this latter approach, the main reason why we engage in a conversation is to realize an activity, and particularly an activity together, by means of a mutual understanding. The mutual understanding is no longer the main goal of the conversation, but it is a simple means through which the dyad reaches the actual goal of a conversation, and realizes a joint activity.
2. ORIGINS AND CHARACTERISTICS OF THE INTERPERSONAL SYNERGETIC APPROACH
The notion of synergy originates from the study of movement, as a way of describing the functional coordination of multi-element systems (Xxxxxxxxx 1967; Xxxxxx 1977). The model originates from the motor control theory by Xxxxxxxxx. He found it unlikely that the central nervous system could be able [to control each] single muscle individually to create coherently directed movements. Rather, [...] self-organizing as[...] degrees of freedom, greatly reducing the amount of control needed. For instance, if we want to strike a chisel with a hammer, the variability of the trajectory of the tip of the hammer across a series of strikes is smaller than the variability of the trajectories of the individual joints on the hammering arm (Xxxxxxxxx 1967).
An example of such a synergy can be the motor coupling of two agents cooperating in order to achieve a common goal: if two agents are carrying a big piece of furniture and one of them raises one side too high up, the other, to keep the piece balanced, can compensate by raising the opposite side in proportional ways. In this sense, their actions come to complement each other, for example, adjusting in opposite directions, maintaining the furniture balanced. Similarly, interlocutors become interdependent in their linguistic behavior.
A recent analysis of the trajectories of flocks of starlings may be another good example of a synergetic behavior driven by the positional adjacency: the birds located at the border of the flock trigger and lead the fluctuation of the formation. The analyses by Xxxxxxx Xxxxxx and his team (Xxxxxxxx et al. 2014a, 2014b) reveal that in flocks of starlings the velocity fluctuations of two distant birds mutually influence each other, and that the seemingly disorganized swarms have some sophisticated features: i.e. they are a self-organized (synergic) whole, and coordinate their behavior so as to optimize their ability to react collectively (e.g. to avoid predators or to attract sexual partners).
3. PREVIOUS WORKS
The interpersonal approach has been developed in recent works (Xxxxxxxxxx et al. 2011; Xxxxxxxx et al. 2012; Xxxxxxxx, Xxxxxxxxx-Xxxxxxxx & Tylén 2014; Xxxxxxxx & Tylén 2015).
[...] fire-walking ritual (in San Xxxxx Xxxxxxxx), and give an account of the so-called collective arousal effect. They measure heart rates (from active participants and spectators) during [...] fire-walking ritual. Thus they investigate the synchronized arousal instead of the synchronized body movements. They also account separately for the measures referring to spectators that are related to the fire-walkers, and spectators that are not (observers). They find that the same heart pace pattern occurs for both the fire-walkers and their related spectators, whose heart rates peaked for the walk of their relatives and friends. But this pattern is not shared with the unrelated observers. Thus what these measures show is [...] fire-walk, but that there are [...] fire-walking ritual across all of the [...] fire-walks, between performers and spectators that are related or tangentially related. Moreover, they show that there is a social catalyzer that promotes a shared pattern of arousal inside the borders of a given social group, whereas it inhibits the same sharing across a social border. A dialogue is a special kind of social catalyzer; by means of the dialogue people share and coordinate their behavior, in order to carry out a synergy.
Xxxxxxxx et al. (2012) conduct an experiment concerning the copy effects between the speech turns in a conversation and the benefits it brings to solving a perceptual problem. The experimental expectation is that the occurrence of the copy effects (at lexical, prosodic, and rhythmic levels) is compatible only with the model of the dialog as an interpersonal synergy.
[...] lexical choices (Xxxxxxxx et al. 2012; Xxxxxx & Xxxxxxxx 1987; Xxxxxx & Xxxxxxxxxx 1966), prosodic patterns (Xxxxxxxx & Xxxxxx 2009; Xxxxxxx & Hirschberg 2011; Xxxxx 2006; Xxxxx et al. 2012), and rhythms of speech and pauses (McFarland 2001; Wilson & Wilson 2005). According to these findings, the conversation is a coherent, co-constructed text to assess structures of recurrence of prosody, speech/pause, and lexicon. In their experiment dyads of participants cooperate linguistically to solve a perceptual task (the participants have to individually indicate which of the 2 displays contains a contrast oddball) and then they have to discuss it. The experimental results confirm that the degree of global linguistic convergence of confidence expres-
4. RHYTHM MEASURES
According to the interpersonal and synergetic approach to the conversation, we present some experimental verifications which show how and to what extent the speech rhythm can be induced and shared by pragmatic conditions, and by the interpersonal synergy in a conversation.
Two main approaches to linguistic rhythm exist in the literature: the hypothesis of discrete rhythmic types, and the assumption that rhythm is a variable property that does not belong to the linguistic system, but to conversational interaction. In this latter approach the function of rhythm is to handle cooperation and conflict among the speakers. Therefore it is not stable, but varies according to its conversational functions.
4.1 Rhythm-property of the system
The hypothesis of rhythmic types goes back to the forties (Xxxxx Xxxxx 1940; Xxxx 1945; Abercrombie 1967; Xxxxx, Xxxxx & Xxxxxxxxxxx 1980; Dauer 1983). It mainly consists of a binary classification (syllable-timed/stress-timed languages1). But it has not yet been clearly experimentally validated (e.g. Shen & Xxx 1974; Lehiste 1977; Xxxxxxx & Xxxxxx 1979; Xxxxx 1982; Xxxx & Xxxxxxxx 1982; Borzone de Xxxxxxxx & Xxxxxxxxx 1983; Dauer 1983; Xxxxx & Xxxxxx 1993). According to a weaker hypothesis, rhythm is a perceptual impression arising from the convergence of some clusters of phonological properties typical of a given language (e.g. Xxxxxx & Xxxxxxxx 1982; Xxxxxx & Xxxxx 1986; Dauer 1987; Xxxxxxxxxx 1981, 1989; Xxxxxx 1990; Xxxxx, Xxxxxx & Xxxxxx 1999). The linguistic typology (syllable/stress-timed) is not discrete and different systems are spread out over a continuum.
Also according to the PVI hypothesis (Xxx & Xxxxx 1995; Xxx, Xxxxx & Xxxxx 2000; Xxxxx & Xxx 2002; Xxxxx & Xxxxxxx 2003), rhythm is an intrinsic property of the system.
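For readers unfamiliar with the index, the following is a minimal sketch of the normalized Pairwise Variability Index as it is commonly defined in the rhythm literature: the mean absolute difference between successive interval durations, each divided by the mean of the pair, multiplied by 100. The function name `npvi` and the example durations are illustrative assumptions and are not data from this study.

```python
from typing import Sequence


def npvi(durations: Sequence[float]) -> float:
    """Normalized Pairwise Variability Index of successive interval durations."""
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    # One normalized difference per pair of adjacent intervals.
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical interval durations in ms (not taken from the corpora).
isochronous = [150.0, 150.0, 150.0, 150.0]
alternating = [100.0, 200.0, 100.0, 200.0]

print(npvi(isochronous))
print(round(npvi(alternating), 2))
```

A perfectly isochronous chain yields 0, and the value grows with the contrast between neighbouring intervals, which is why the index is read as a property of the sequence rather than of any single interval.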
4.2 Rhythm-variable property of conversation
This hypothesis derives from conversational analysis studies, and represents rhythmic features in Gestalt terms. Recently, a new impetus has been given by the so-called phonetic-details studies (cf. Xxxxx, Xxxxxxxxx & Xxxxxxxxx 1974; Erickson 1982; Erickson & Xxxxxx 1982; Xxxxxx 1991; Couper-Kuhlen 1989, 1990, 1993, 2001; Buder 1986, 1991, 1996; Auer et al. 1999; Buder & Xxxxxxxx 1997, 1999; Local 2003; Fon 2006; House 2007; Xxxxx & Xxxxx 2008; Arvaniti 2009; Xxxx 2010). In this paradigm, rhythm may vary during interaction according to the conversational tasks: it is not a property of the system, but a tactical resource of the speaker.

1 In syllable-timed languages the duration of every syllable is equal. In stress-timed languages the interval between two stressed syllables is equal.
To this paradigm three more branches belong: the studies on the metrical feet variability (heterometry) (Xxxxx & Xxxxxxxx 2010); the studies on rhythm as an entrainment phenomenon (Cummins & Port 1998; Port 2003; Cummins 2009); the studies of rhythm as an Adaptive Oscillator (Port, Cummins & Gasser 1996).
5. EXPERIMENTAL ANALYSIS
Before starting the main analysis of the polemical corpus, we ran a preliminary experimental test in order to verify to what extent the speech metrics can be induced by some pragmatic conditions. Two corpora have been elicited. The first corpus was obtained by an experimental collaborative task in which the subjects were asked to synchronize their speech with a recorded one. The second is a natural corpus in which two speakers are engaged in a polemical interaction (the so-called quarrel between Xxxxxxxx Xxxxxx and Xxxx Xxxxxxxxx during the Italian TV show Telemike in 1991). In each of the two corpora we [measured the interstress intervals (henceforth Acc, i.e. the tem]poral distance between the stressed syllables); the syllabic intervals (hence[forth Syl) [...]. These meas]urements were used to check the metrical typology (stress/syllable-timing) and its variation along the corpus.
5.1 Experiment on the collaborative corpus
We recorded the sentence Il capostazione ha spento la luce ('The station master has turned the lights out'). The rhythm of the original signal (A) is neither syllable-timed nor stress-timed (the duration of Acc in the original signal A is not constant, and the same goes for the duration of Syl). Then we manipulated the signal A in order to build new ones with constant Syl or Acc (signals D and E). On these signals we built a Listen & Repeat test. The working hypothesis was that listening to these signals (B, C, D, E) will induce the listener to a syllable-timing or a stress-timing rhythm, according to the manipulated signals. To the same purpose, before the original signal A, we inserted three beeps,2 either 150 ms apart (equal to the mean Syl in the original signal) or 496 ms apart (equal to the mean Acc in the original signal). Thus we produced the signals B and C. We also manipulated the duration of Syl and Acc in order to equalize their duration respectively to 150 ms and to 496 ms. Thus we produced the signals E and D. Both mean values (150 ms and 496 ms) are only artificial targets. They do not represent an actual target rhythm, or a natural [...] rhythm by a subject. The signals for this listening test (PASSIVE CORPUS) are listed in Table 1.

2 At least three evenly spaced beats are required in order to establish an isochronous chain (Couper-Kuhlen 1990: 16).
SIGNAL | DESCRIPTION |
PASSIVE CORPUS | |
A | original signal |
B | three beeps (150 ms apart) + signal A |
C | three beeps (496 ms apart) + signal A |
D | signal A: equalized Acc durations: 496 ms |
E | signal A: equalized Syl durations: 150 ms |
ACTIVE CORPUS | |
1 | signal recorded after listening to the signal A |
2 | signal recorded after listening to the signal B |
3 | signal recorded after listening to the signal C |
4 | signal recorded after listening to the signal D |
5 | signal recorded after listening to the signal E |
Table 1. Passive and active corpora.
The ACTIVE CORPUS is composed of the sentences that the subjects recorded after listening to the passive corpus.
Five university students took part in the experiment: S1 (male, age 52, born in Rome where he lives); S2 (female, age 24, born in Terni where she lives); S3 (female, age 51, born in Orune-Nu, but living in Tuscania-Vt); S4 (female, age 19, born in Alatri-Fr where she lives); S5 (female, age 19, born in Civitavecchia where she lives). They listened to signal A, and were asked to utter the same sentence into the microphone (signal 1 of the active corpus). Then they listened to the further 4 signals and repeated them into the microphone, trying to imitate them and to keep as close as possible to the timing of the signal they had listened to with headphones. Thus, the ACTIVE CORPUS contains 25 signals (5 per subject), as shown in Table 1.
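The timing of the passive-corpus stimuli can be sketched as follows. The two target values (150 ms and 496 ms) and the three-beep lead-in come from the text; everything else is an assumption made for illustration: the function names, the idea of expressing the equalization as one time-stretch factor per interval, and the example durations. The text does not say which tool or procedure was used to manipulate signal A.

```python
from typing import List

MEAN_SYL_MS = 150.0  # mean syllable duration of signal A (from the text)
MEAN_ACC_MS = 496.0  # mean interstress interval of signal A (from the text)


def beep_onsets(interval_ms: float, n_beeps: int = 3) -> List[float]:
    """Onset times (ms) of the evenly spaced beeps placed before signal A."""
    return [i * interval_ms for i in range(n_beeps)]


def stretch_factors(durations_ms: List[float], target_ms: float) -> List[float]:
    """Hypothetical helper: factor by which each interval would have to be
    lengthened (>1) or shortened (<1) to reach the target duration."""
    return [target_ms / d for d in durations_ms]


# Signals B and C: beeps at the mean Syl and at the mean Acc spacing.
print(beep_onsets(MEAN_SYL_MS))
print(beep_onsets(MEAN_ACC_MS))

# Signal E: made-up syllable durations brought to the 150 ms target.
print(stretch_factors([100.0, 300.0], MEAN_SYL_MS))
```

Expressing the manipulation as per-interval factors keeps the sketch independent of any particular signal-processing tool.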
5.2 Experimental expectations and results
Compared to signal 1 (recorded at the beginning of the session), signals 4 and 3 should show equalized Acc durations close to 496 ms (but the absolute value depends on the speaking rate). Signals 5 and 2 should show equalized Syl durations close to 150 ms (but the absolute value depends on the speaking rate). If these expectations are confirmed, then the syllable-timed or stress-timed rhythm is an effect of the communicative interaction: since we are able to induce it, it is not a property of the linguistic system. The results validate these expectations. Table 2 shows the Syl and Acc durations in the active corpus, and their standard deviation. Signals 2 and 5 systematically approach the reference value (150 ms) as compared to signal 1; likewise, 3 and 4 systematically approach the reference value (496 ms). Therefore the standard deviation decreases.
SYLLABLES | |||||||
INTERSTRESS INTERVALS | |||||||
S1 | 402.3 | 78.61 | 476.0 | 76.21 | 474.0 | 53.69 | |
S2 | 620.0 | 253.74 | 585.3 | 110.88 | 550.3 | 152.16 | |
S3 | 560.6 | 232.02 | 493.0 | 124.38 | 500.3 | 87.06 | |
S4 | 510.6 | 115.86 | 444.0 | 46.16 | 477.0 | 18.68 | |
S5 | 418.3 | 65.54 | 477.3 | 29.29 | 479.6 | 23.09 |
Table 2. Mean Syl/Acc durations (ms) and their standard deviation.
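The comparison behind Table 2 can be sketched in a few lines. The figures below are copied from the table as it appears here; since the column headers did not survive, the sketch assumes (and this is only an assumption) that the three mean/standard-deviation pairs of each row are, in order, signal 1 and the two repetitions targeted at the 496 ms interstress value. The helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Row = List[Tuple[float, float]]  # (mean ms, standard deviation ms) pairs

# ASSUMPTION: pair order is signal 1 (baseline), then the two Acc-targeted
# repetitions; the table headers are missing, so this reading is a guess.
TABLE2: Dict[str, Row] = {
    "S1": [(402.3, 78.61), (476.0, 76.21), (474.0, 53.69)],
    "S2": [(620.0, 253.74), (585.3, 110.88), (550.3, 152.16)],
    "S3": [(560.6, 232.02), (493.0, 124.38), (500.3, 87.06)],
    "S4": [(510.6, 115.86), (444.0, 46.16), (477.0, 18.68)],
    "S5": [(418.3, 65.54), (477.3, 29.29), (479.6, 23.09)],
}
TARGET_ACC_MS = 496.0  # reference value given in the text


def sd_decreases(row: Row) -> bool:
    """True if every later condition has a smaller SD than the baseline."""
    baseline_sd = row[0][1]
    return all(sd < baseline_sd for _, sd in row[1:])


def distance_to_target(row: Row, target: float = TARGET_ACC_MS) -> List[float]:
    """Absolute distance (ms) of each condition's mean from the target."""
    return [round(abs(mean - target), 1) for mean, _ in row]


for subject, row in TABLE2.items():
    print(subject, sd_decreases(row), distance_to_target(row))
```

A smaller standard deviation means more nearly equal intervals, which is the sense in which the repetitions are said to become more isochronous.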
5.3 Experiment on the polemical corpus
The polemical corpus is the quarrel between two Italian TV showmen: Xxxx Xxxxxxxxx and Xxxxxxxx Xxxxxx (Telemike in 1991). It was downloaded from YouTube. Its low audio quality creates no problem with the duration measurements. However, the low quality of that material, the occurrence of several overlappings, and the compressed format of YouTube audio (mp3) impede any spectral analysis or any extraction of F0. The polemical corpus is a communicative situation where the speakers do not collaborate, but manage to hinder and sabotage each other. A typical expression of this hostile intent is represented by repetitions, interruptions, and overlappings. The audio recording has been transcribed (both in IPA, and in orthographic layout); then we extracted the sequences of speech turns containing overlappings, repetitions, interruptions, and among them we selected those which could be analyzed (the choice was made taking into consideration the sequences without environmental noises, applause, shouting, etc.); these sequences of speech turns are called chains. There are 23 chains (1-23) in the quarrel. Each chain represents a domain of at least one speech turn, and includes at least an overlapping3, and/or a repetition, and/or an interruption.
(1) | Xxxx: | questi son affari loro sono costruite |
 | Sgarbi: | no sono sono affari |
(2) | Xxxx: | costruite |
 | Sgarbi: | sono sono affari tuoi |
(3) | Sgarbi: | tu dove abiti? |
(4) | Sgarbi: | |
(5) | Sgarbi: | i poveri devono stare in case brutte? |
(6) | Sgarbi: | a zaffran [Xxxxxxxxx] etnea? |
(7) | Sgarbi: | |
(8) | Sgarbi: | |
(9) | Sgarbi: | di quelli che voglion i poveri nel brutto |
(10) | Xxxx: | allora so |
(11) | Sgarbi: | |
(12) | Sgarbi: | ti voglio dire quello che ho detto io |
(13) | Xxxx: | posso dire due parole? |
 | Sgarbi: | no non puoi |
(14) | Xxxx: | parole |
 | | words? |
 | Sgarbi: | no non puoi dirle perché dici delle cazzate |
(15) | Sgarbi: | questo è il concetto |
 | | this is the idea |
(16) | Xxxx: | dici tu, va bene? io non dic nessuna cazzata |
 | Sgarbi: | insieme a dirlo io |
(17) | Xxxx: | adesso parlo io sì adesso mettila giù |
 | Sgarbi: | ora litiga vuoi far a pugni con me no puoi parlare no |
 | | now fight! do you want [...] |
(18) | Xxxx: | no perché tu devi smettere |
 | | [...] anything |
 | Sgarbi: | voglio vuoi fare una rissa no |
(19) | Xxxx: | smetti |
 | Sgarbi: | tu stai dicendo delle cose che non sai |
(20) | Xxxx: | io non ho ancora io non ho |
 | Sgarbi: | quelli che parlano a vanvera è andato |
(21) | Xxxx: | |
(22) | Xxxx: | parli sempre taci qualche volta |
 | Sgarbi: | mi fai parlare mi fai parlare |
(23) | Xxxx: | taci qualche volta |

3 Overlapping turn-taking is represented by columned text.
The signals are annotated by means of 6 Praat Tiers (see Figures 1-4 and 6), as follows: (1) orthographic transcription: Xxxx, (2) orthographic [...] IPA and boundaries.
5.4 Experimental expectations and results
In the polemical corpus we expect a minimal degree of rhythmic integration: i.e. anisochrony. The results confirm these expectations. Indeed, no stable rhythmic pattern exists. Furthermore, the metric of each turn changes according to the conversational purposes; in particular, the speaker may borrow his [...] or implement the opposite (for instance a syllable-timed vs. a stress-timed rhythm) in order to dominate by cutting or easing [...] [met]rical types in each speech turn is a function of the conversational strategy of the speaker to create dominance.
STRATEGY | Chain | Xxxx | Xxxxxx | Sylld | Sylld | Accd | Accd | T | ANO |
1 | A | 151.5 | 52.15 | ||||||
2 | S | 110.6 | 272 | 107.25 | |||||
3 | S | 440.5 | 228.39 | ||||||
4 | A | 100.8 | 44.25 | ||||||
5 | A | 108.9 | 52.79 | ||||||
6 | S | 244 | 74.95 | ||||||
7 | S | 478.5 | 144.95 | ||||||
8 | S | 467 | 224.86 | ||||||
9 | A | 150.1 | 85.41 | ||||||
10 | S | 1 stress | -- | ||||||
11 | A | 187.8 | 106.39 | ||||||
12 | A | 77 | 29.17 | ||||||
13 | S | 297.3 | 35.83 | ||||||
14 | * | ||||||||
15 | * | ||||||||
16 | Ac | Ac | 180.4 144.3 | 43.8 45.7 | |||||
17 | * | * | 126.8 128.3 | 56.5 80.7 | 47.7 102.5 | ||||
18 | A | 168.2 | 63.38 | ||||||
19 | S | 302.7 | 166.44 | ||||||
20 | Sc | Sc | |||||||
21 | A | 168.1 | 73.82 | ||||||
22 | A | 195.4 | 62.51 | ||||||
23 | S | 362 | 76.36 |
Table 3:
As is shown in Table 3, at the beginning both speakers alternate different metrics, in a [...] follows: Xxxxxx tries to disrupt the metrical strategy of Xxxx, using an asynchronous rhythm. Then, both speakers resume their confrontation, but change their tactics: there is the first instance of speech turns overlapping [...] [stress]-timed and synchronized. [...] quarrellers are completely asynchronous, but dynamically tuned. Then, there are [...] [stress]-timed trend chain by Xxxx and a following one by Xxxxxx, showing an opposed syllable-timed trend. Then, the [...] to the previous one: there is a third turns overlapping with a common syllable-[...] alternate rhythmical trend: stress- and syllable-timed. Four examples of these turns are given below.
1. TRUCE-RHYTHM (Table 3: chains 14-15; Figures 1-2). Sgarbi shows no rhythmic isochrony: an extreme case of polemical strategy (maybe in order to [...] values are very high both for the Acc and the Syl mean durations.
2. MIMETIC-RHYTHM (Table 3: chain 16; Figure 3). It is a first overlapping where both speakers tend to have a common stress-timed rhythm, and synchronized interstress boundaries: the difference between their mean Acc durations is not significant, a[...] and Anova tests (0.08>0.05).
3. ROLLING-RHYTHM (Table 3: chain 17; Figure 4), a second overlapping. The turns are anisochronous (the difference between their mean Acc duration [...] - 0.01<0.05 - and Anova tests - 0.02<0.05), but with a peculiarity: both speakers undertake a sort of [...] [b]oth quarrellers is dynamically tuned, i.e. each turn takes up the Acc durational trend (increasing or decreasing) of the previous one, uttered by the interlocutor. As shown in Figure 5, Xxxxxx produces a sequence of three increasing intervals (62-190-388 ms), followed by a reply by Xxxx with three equally increasing intervals (000-000-000 ms); then, Xxxxxx reverses the trend, realizing a 170 ms interval, and Xxxx pursues the decreasing trend with a 291 ms interval. Finally, Sgarbi reverses the trend again and produces a 236 ms interval, and Xxxx replies with the same increasing trend (365 ms).
4. MIMETIC-RHYTHM (Table 3: chain 20; Figure 6). In this third overlapping, both speakers tend to have a common syllable-timed rhythm and they even tend to overlap, to make their syllabic boundaries isotopic (synchronized): the difference between their mean Syl is not significantly [...]
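The between-speaker comparisons above rest on t-tests and Anova over the interval durations of the two overlapping turns, with 0.05 as the significance threshold. As a rough, self-contained illustration of that kind of comparison, the sketch below uses a two-sided permutation test on the difference of means. This is a stand-in technique chosen because it needs only the standard library; it is not the procedure used in the study, and the function name and the durations are hypothetical.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(
    a: Sequence[float],
    b: Sequence[float],
    n_resamples: int = 5000,
    seed: int = 0,
) -> float:
    """Two-sided permutation test for a difference in mean duration."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[: len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up interstress intervals (ms) for two speakers in one chain.
speaker_1 = [100.0, 101.0, 102.0, 103.0, 104.0]
speaker_2 = [500.0, 501.0, 502.0, 503.0, 504.0]

p = permutation_p_value(speaker_1, speaker_2)
print("different rhythm" if p < 0.05 else "no significant difference")
```

Under this reading, a p-value above 0.05 (as in chain 16) is what licenses calling the two turns rhythmically aligned, while a value below it (as in chain 17) marks them as anisochronous.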
6. DISCUSSION
The rhythmic strategies are particularly interesting, because they can be ar[...] of the preliminary experiment on the collaborative corpus simply show that the rhythmic patterns can be induced, and are not (or, at least, not entirely) [...] and performance. The results of the main experiment on the polemical natural [...] rhythm enters a sort of synergy. In this synergy the implementation of a given rhythm depends on the conversational strategy of each speaker, and on the demand to carry out a joint activity with his interlocutor. These findings are highly compatible with the model of conversation as an interpersonal synergy. And this synergy is also at work during a quarrel, that is, in a conversation where the goal is not to reach a common understanding, nor to share or transfer some information, but to overpower, to make the interlocutor unable to react.
Moreover, the data have some consequences concerning the role of the conversation as a reliable means to elicit the grammar of a spoken language. [...] reconstruction of the linguistic system of an oral language. In fact, the conversation has been assumed to be representative of a (semi) spontaneous linguistic behavior, such as the linguistic interaction between two (or more) speakers engaged in a dialog. But, according to our findings, conversation is actually not a good witness of the grammar the speakers are using, or at least [...] performance during a conversation, at least for their rhythmic choices, is not grammar-driven, but action-driven: they assume a rhythm or another according not to the language (and the grammar) they are speaking, but [...] synergy. In short, conversation is a kind of joint action, locally motivated, more than an example of a particular realization of an abstract linguistic system, shared by both talkers.4
4 The grammar of a spoken language is not entirely the same thing as the rhythm of that language, but the rhythm is one of the main components of its prosody, and thus of its grammar. Prosody is a distinctive constituent of speech, and does not occur in written texts. Thus the rhythm and the prosody mainly shape the grammar of a spoken language.
One more consideration can be added. We argued before that a problem with the traditional approach to the dialog is that it predicts that the linguistic alignment between the interlocutors should be dynamic, that is, it should not appear at the very beginning of a conversation, whereas the interpersonal approach predicts that the alignment should happen all over the dialog. Our data seem not to fulfill the expectation of the interpersonal approach, because in the quarrel between Xxxx and Xxxxxx (polemical corpus) the strong alignment happens in the middle of the conversation, whereas at the beginning and at the end each quarreler shows a personal and unshared rhythmic behavior: Xxxx starts with a stress-timed rhythmic trend, and Xxxxxx with a syllable-timed one; and Xxxx ends with the same stress-timing; in the middle of the quarrel-conversation various types of rhythmic alignments happen. But these apparent falsifications of the model can be interpreted as rhythmic events that serve as markers of the edges of a dialog. The reason why a dialog needs to be segregated from the outside, and delimited by an initial and a final border, is that the dialog is a kind of ritual act. Like every social routine, the dialog can be understood as a social plot which agents (interlocutors) need to enter and leave. All stages of this plot comply with a morphology and syntagmatics, in the sense of Xxxxx (1928). Thus the beginning and the end of the conversation, its edges, provide a structural path to and from the plot. In the beginning of the conversation-quarrel between Xxxx and Xxxxxx each interlocutor keeps his rhythmic identity (Xxxx shows a stress-timed identity, and Xxxxxx a syllable-timed isochrony), then both lose their identities and become interdependent in a new synergy, a coherent, co-constructed new entity: the conversation.
At the end of the dialog, both interlocutors have to leave the dyad and come back to their previous identities: so Xxxx ends the conversation putting on the same stress-timed rhythm he was using before entering the dialog.
Figure 1. Sgarbi: NO NON PUOI DIRLE PERCHÉ DICI DELLE CAZZ 'no, you can't tattle this as you talk bullshit'
Figure 2. SGARBI: QUESTO È IL CONCETTO 'THIS IS THE IDEA'
Figure 3. XXXX: DICI TU. VA BENE. IO NON DICO NESSUNA CAZ. '[Y]OU SAY THAT. OK. I DON'T TALK BULLSHIT' SGARBI: SÌ D'ACCORDO, LE DICIAMO INSIEME, SIAMO INSIEME A DIRLO..IO '[Y]ES, OK, WE TALK AT ONCE. WE TALK AT ONCE, [...]'
Figure 4. XXXX: ADESSO PARLO IO. SÌ ADESSO METTILA GIÙ '[.]O IT'S MY TURN. YES NOW PUT IT DOWN' SGARBI: ORA LITIGHIAMO. VUOI FARE A PUGNI CON ME? NO PUOI PARLARE. NO '[N]OW WE ARE GOING TO HAVE A QUARREL. DO YOU WANT TO BOX ME? NO YOU CAN TALK. NO'
Figure 5: Rolling-rhythm. Variation of interstress intervals duration by Xxxx and Xxxxxx, arranged in sequence.
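The trend-following pattern plotted in Figure 5 can be made explicit with a short sketch. It treats the exchange as alternating moves and checks that each reply goes in the same direction (rising or falling) as the move it answers. The Sgarbi intervals and the last two Xxxx intervals are those quoted in the text; the three intervals of the first Xxxx reply are not available here, so the values used for them are placeholders invented purely for illustration, chosen only to be increasing as the text describes. The function names are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Move = Tuple[str, List[float]]  # (speaker, interstress intervals in ms)

MOVES: List[Move] = [
    ("Sgarbi", [62.0, 190.0, 388.0]),
    ("Xxxx", [120.0, 250.0, 400.0]),  # PLACEHOLDER values, not from the paper
    ("Sgarbi", [170.0]),
    ("Xxxx", [291.0]),
    ("Sgarbi", [236.0]),
    ("Xxxx", [365.0]),
]


def directions(moves: List[Move]) -> List[int]:
    """Return +1 (rising), -1 (falling) or 0 (flat) for each move.

    A move with several intervals is compared internally (last vs. first);
    a single-interval move is compared with that speaker's previous interval.
    """
    last_seen: Dict[str, float] = {}
    result: List[int] = []
    for speaker, intervals in moves:
        if len(intervals) > 1:
            start = intervals[0]
        else:
            start = last_seen.get(speaker, intervals[0])
        delta = intervals[-1] - start
        result.append((delta > 0) - (delta < 0))
        last_seen[speaker] = intervals[-1]
    return result


def replies_follow_trend(moves: List[Move]) -> bool:
    """True if every second move matches the direction of the move before it."""
    d = directions(moves)
    return all(d[i] == d[i - 1] for i in range(1, len(d), 2))


print(directions(MOVES))
print(replies_follow_trend(MOVES))
```

The check only looks at the direction of change, not at its size, which matches the description of the turns as anisochronous yet dynamically tuned.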
Figure 6. XXXX: IO NON HO ANCORA.. IO NON HO '[I] HAVE NOT YET .. I HAVE NOT' SGARBI: QUELLI CHE PARLANO A VANVERA. È ANDATO '[T]HOSE WHO XXXXXXX. IT'S GONE'
REFERENCES
Abercrombie, D. (1967). Elements of general phonetics. Edinburgh: Edinburgh University Press.
Xxxxxxxx, X., X. Xxxxx, X. Xxxxxx Xxxx, X. Xxxxx, X. Xxxxxxx, X. Xxxxxx, X. Xxxxx, X. Xxxxxx, X. XxXxxxxxxx, X. Xxxxxx, X. Xxxxxxx, X. Xxxxxxxx & X. Xxxxxxx (1991). The HCRC map task corpus. Language and Speech 34(4). 351-366.
Xxxxxxxx, X. (2009). Rhythm, timing, and the timing of rhythm. Phonetica 66(1-2). 46-63.
Xxxxxxxx, A., X. Xxxxxxx, X. Xxx Xxxxxxxx, I. Xxxxxxxx, X. Xxxxxxx, X. Xxxxxx, X. Xxxx, X. Xxxxxxx, X. Xxxx, X. Xxxxxxxxx & X. Xxxxx (2014a). Finite-Size Scaling as a Way to Probe Near-Criticality in Natural Swarms. Physical Review Letters 113. 238102.
Xxxxxxxx, A., X. Xxxxxxx, X. Xxx Xxxxxxxx, I. Xxxxxxxx, X. Xxxxxxx, X. Xxxxxx, X. Xxxx, X. Xxxxxxx, X. Xxxx, X. Xxxxxxxxx & X. Xxxxx (2014b). Collective Behaviour without Collective Order in Wild Swarms of Midges. PLoS Computational Biology 10(7).
Xxxx, P., X. Xxxxxx-Kuhlen & X. Xxxxxx (1999). Language in Time. New York: Oxford University Press.
Xxxxxxxxx, X.X. (1967). Coordination and regulation of movement. New York: Pergamon Press.
Xxxxxxxxxx, P.M. (1981). Strutture prosodiche dell'italiano. Accento, quantità, sillaba, giuntura, fondamenti metrici. Firenze: Accademia della Crusca.
Xxxxxxxxxx, P.M. - Revue de Phonétique Appliquée 91/93. 99-130.
Xxxxxxxx, X. (1965). Pitch accent and sentence rhythm. In I. Abe & T. Xxxxxxxx (eds.), Forms of English: Accent, Morpheme, Order, 139-180. Cambridge MA: Harvard University Press.
Borzone de Xxxxxxxx, A.M. & X. Xxxxxxxxx (1983). Segmental durations and the rhythm in Spanish. Journal of Phonetics 11. 117-128.
Xxxxx, X., X. Xxxxxxxx, X. Xxxxxxxxx, & X. Xxxx (1983). Teaching talk: strategies for production and assessment. Cambridge: Cambridge University Press.
Xxxxx, X. & The changing rhythms of speech. In Xxxxxxxx-Xxxxxxx, Xxxx et al. (eds.), Proceedings of the Fifth International Conference on Speech Prosody, paper 074. Chicago: ISCA.
Xxxxx, X.X. (1986). Coherence of speech rhythms in conversations: Autocorrelation analysis of fundamental voice frequency. Toronto: Toronto Semiotic Circle.
Xxxxx, X.X. (1991). Vocal synchrony in conversations: spectral analysis of fundamental voice frequency. PhD dissertation, University of Wisconsin, Madison, WI.
Xxxxx, X.X. (1996). Dynamics of speech processes in dyadic interaction. In J.H. Watt & C.A. XxxXxxx (eds.), Dynamic Patterns in Communication Processes, 000-000. Xxxxxxxx Xxxx, XX: Sage.
Xxxxx, X.X. & X. Xxxxxxxx (1997). Prosodic cycles and interpersonal synchrony in American English and Swedish. In X. Xxxxxxxxxx, X. Xxxxxxxxx, X. Xxxxxxxx (eds.), , vol. 1, 235-238. Grenoble: European Speech Comm. Association.
Xxxxx, X.X. & X. Xxxxxxxx (1999). Time-series analysis of conversational prosody for the identification of rhythmic units. In J. J. Xxxxx, X. Xxxxxxxx, X. Xxxxx, X. Xxxxxxxxx, A. C. Xxxxxx (eds.), Proceedings of the 14th International Congress of Phonetic Sciences, vol. 2, 1071-1074. San Xxxxxxxxx: Univ. of California.
Xxxxxxxx, X., X. Xxxxx, X. Xxxxx, X. Xxxxxx, X. Xxxxxxx-Xxxxxxx & X. Xxxxxxxx (1995). The coding of dialogue structure in a corpus. In J. A. Xxxxxxxxx, Xxxx X. xxx xx Xxxxx & Xxxxxx X. xxx xxx Xxxxxx (eds.), Proceedings of the Xxxxx Xxxxxx Workshop on Language Technology: corpus-based approaches to dialogue modelling, 25-34. Enschede: Universiteit Twente.
Xxxxxxxx, X., X. Xxxxx, X. Xxxxx, X. Xxxxxx, X. Xxxxxxx-Xxxxxxx & X. Xxxxxxxx (1997). The reliability of a dialogue structure coding scheme. Journal of Computational Linguistics 23(1). 13-31.
Xxxxx, X.X. (1996). Using language. Cambridge: Cambridge University Press.
Couper-Kuhlen, E. (1989). Speech rhythm at turn transitions: its functioning in everyday conversation. Part I, XxxxXX, Working Paper No. 5. Konstanz: Fachgruppe Sprachwissenschaft, University of Konstanz.
Couper-Kuhlen, E. (1990). Discovering rhythm in conversational English: Perceptual and acoustic approaches to the analysis of isochrony, KontRI, Working Paper No. 13. Konstanz: Fachgruppe Sprachwissenschaft, University of Konstanz.
Couper-Kuhlen, E. (1993). English Speech Rhythm: Form and Function in Everyday Verbal Interaction. Amsterdam: Benjamins.
Couper-Kuhlen, E. (2001). Intonation and Discourse: Current Views from Within. In X. Xxxxxxxxx, X. Xxxxxx, & H. Xxxxxxxxxxx Xxxxxxxx (eds.), The Handbook of Discourse Analysis, 13-34. Oxford: Xxxxxxxxx.
Xxxxxxx, X. (2009). Rhythm as Entrainment: The Case of Synchronous Speech. Journal of Phonetics 37(1). 16-28.
Xxxxxxx, X. & R.F. Port (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics 26(2). 145-171.
Xxxxxx, X. (1991). Prosody in situations of communication: salience and segmentation. In Proceedings of the 12th International Congress of Phonetic Sciences, vol. 1, 264-270. Aix-en-Provence.
Xxxxxx, X. & X. Xxxxxxxx (1982). On pre-accentual lengthening. Journal of the International Phonetic Association 12. 58-69.
Xxxxx, X.X. (1983). Stress-timing and Syllable-timing Reanalyzed. Journal of Phonetics 11. 51-62.
Xxxxx, X.X. (1987). Phonetic and Phonological Components of Language Rhythm. In Proceedings of the 11th International Congress of Phonetic Sciences, vol. 5, 447-450. Tallinn: Academy of Sciences.
Xxxxxxx, X. & Xx. X. Xxxxxx (1979). The perceived rhythm of speech. In Proceedings of the 9th International Congress of Phonetic Sciences, vol. 2, 268-274. Copenhagen: Institute of Phonetics.
Xxxxx, X. & X. Xxxxxx (1993). Accent structure in music performance. Music Perception 10(3). 343-378.
Xxxxxxxx, X. (1982). Money tree, lasagna bush, salt and pepper: Social construction of topical cohesion in a conversation among Italian Americans. In X. Xxxxxx (ed.), Analyzing discourse: Text and talk, 43-70. Washington, D.C.: Xxxxxxxxxx University Press.
Erickson, X. & X. Xxxxxx (1982). The Counselor as Gate Keeper: Social Interaction in Interviews. New York: Academic Press.
Xxxxx, X., D.J. Xxxxx & X. Xxxxxxxxxxx (1980). Rhythm in English: Isochronism, pitch and perceived stress. In L.R. Xxxxx & X.X. xxx Xxxxxxxxxxxx (eds.), The Xxxxxx of Language, 71-79. Baltimore: University Park Press.
Xxx, X. (2006). Cross-dialectal turn exchange rhythm in English interviews. In X. Xxxxxxx & X. Xxxxxxxx (eds.), Proceedings of the 3rd Speech Prosody Conference, PS4-16-179. Dresden: TUDpress.
Xxxxxxxx, X., X. Xxxxxxx, X. Xxxxx, X. Xxxxxxxxxx, X. Xxxx, Xx. Xxxxx & K. Tylén (2012). Coming to Terms: Quantifying the Benefits of Linguistic Coordination. Psychological Science 23(8). 931-939.
Fus -Leonardi & X. Tylén (2014). Dialog as interpersonal synergy. New Ideas in Psychology 32. 147-157.
Xxxxxxxx, X. & K. Tylén (2015). Investigating Conversational Dynamics: Interactive Alignment, Interpersonal Synergy, and Collective Task Performance. Cognitive Science. 1-27.
Xxxxxx, X. & A. Xxxxxxxx (1987). Saying what you mean in dialogue: A study in conceptual and semantic co-ordination. Cognition 27. 181-218.
Xxxxx, X. & E.L. Low (2002). Durational Variability in Speech and the Rhythm Class Hypothesis. In X. Xxxxxxxxxxx & X. Xxxxxx (eds.), Papers in Laboratory Phonology 7. The Hague: Xxxxxx de Gruyter. xxxx://xxx.xxxx.xx.xx.xx/xxx- ther/ivyweb/ Grabe_Low.doc.
Xxxxx, X. (2007). The role of prosody in constraining context selection: a procedural approach. Nouveaux cahiers de linguistique française 28. 369-383.
Xxxxxxxxxx, I., D. Xygalatas, X. Xxxxxxxx, X. Xxxxxxx, X.-X. Xxxxxxx, X. Xxxxxx, X. Xxx Orden & X. Xxxxxxxxxx (2011). Synchronized arousal between performers and related spectators in a fire-walking ritual. Proceedings of the National Academy of Sciences 108. 8514-8519.
Xxxxxxxx, X. & X. Xxxxxx (2009). Monitoring convergence of temporal features in spontaneous dialogue speech. Dublin: UCD Working Papers.
Xxxxxx, R.M. & X. Xxxxxxxxxx (1966). Concurrent feedback, confirmation, and the encoding of referents in verbal communication. Journal of Personality and Social Psychology 4. 343-346.
Xxx, X.X. (1974). Prosodic Aids to Speech Recognition: IV. A General Strategy for Prosodically-Guided Speech Understanding. Univac Report PX10791, St. Xxxx, Minnesota: Xxxxxx Univac, DSD.
Xxxxxxx, X. (1977). Isochrony reconsidered. Journal of Phonetics 5. 253-263.
Xxxxxxx, X. & X. Xxxxxxxxxx (2011). Measuring acoustic-prosodic entrainment with respect to multiple levels and dimensions. In X. Xxxxxxxxxx, X. Xxxxxxx (eds.), Proceedings of Interspeech 2011, 3081-3084. Brisbane: International Speech Communications Association.
Xxxxx Xxxxx, X. (1940). Speech Signals in Telephony. London: Xxxxxx.
Local, J. (2003). Phonetics and talk-in-interaction. In M. J. Xxxx, X. Xxxxxxxx, & X. Xxxxxx (eds.), Proceedings of the 15th International Congress of Phonetic Sciences, 115-118. Barcelona: Casual.
Low, E.L. & X. Xxxxx (1995). Prosodic patterns in Singapore English. In X. Xxxxx & P. Xxxxxxxxx (eds.), Proceedings of the 13th International Congress of Phonetic Sciences, vol. 3, 636-639. Stockholm.
Xxx, E.L., X. Xxxxx & X. Xxxxx (2000). Quantitative characterizations of speech rhythm: syllable-timing in Singapore English. Language & Speech 43(4). 377-401.
McFarland, D.H. (2001). Respiratory markers of conversational interaction. Journal of Speech, Language and Hearing Research 44. 128-143.
Xxxxxx, M. (1990). On the rhythm parameter in phonology. In I.M. Roca (ed.), Logical issues in language acquisition, 157-175. Dordrecht: Foris.
Xxxxxx, M. & I. Xxxxx (1986). Prosodic phonology. Dordrecht: Foris.
D. (1965). The perception of time intervals. (Progress Report 2), London: Phonetics Laboratory, University College.
Xxxxx, X.X. (2006). On phonetic convergence during conversational interaction. Journal of the Acoustical Society of America 119. 2382-2393.
Xxxxx, X.X., X. Xxxxxxx, X. Xxxxxx & R.M. Xxxxxx (2012). Phonetic convergence in college roommates. Journal of Phonetics 40. 190-197.
Xxxxx, X.X. & X.X. Xxxxxxx (2003). An empirical comparison of rhythm in language and music. Cognition 87(1). X00-X00.
Xxxx, K.L. (1945). The intonation of American English. Xxx Xxxxx: University of Michigan Press.
Port, R. (2003). Meter and speech. Journal of Phonetics 31. 599-611.
Port, R., X. Xxxxxxx & X. Xxxxxx (1996). A Dynamic Approach to Rhythm in Language: Toward a Temporal Phonology. In X. Xxxx & X. Xxxx (eds.), Proceedings of the Chicago Linguistics Society, 375-397. Chicago: Dept. of Linguistics, University of Chicago.
Xxxxx, X. (1928). Morphology of the Folktale. Austin: University of Texas Press, 1958.
Xxxxx, X., X. Xxxxxx & X. Xxxxxx (1999). Correlates of linguistic rhythm in the speech signal. Cognition 72. 1-28.
Xxxx, X.X. (2010). Speech rhythm across turn transitions in cross-cultural talk-in-interaction. Journal of Pragmatics 42(4). 1037-1059.
- - languages. In D. Xxxxxxx (ed.), Linguistic controversies: essays in linguistic theory and practice in honour of F.R. Xxxxxx, 73-79. London: Xxxxxx Xxxxxx.
Xxxxx, X. & X. Xxxxx (2008). Isochrony reconsidered: Objectifying relations between rhythm measures and speech tempo. In P. A. Xxxxxxx, X. Xxxxxxxxx & X. Xxxx (eds.), Proceedings of the Speech Prosody 2008 Conference, 419-422. Campinas.
Xxxxx, X., X. Xxxxxxxxx & X. Xxxxxxxxx (1974). A simplest systematics for the organization of turn-taking for conversation. Language 50. 696-735.
Xxxx, X. & G.G. Xxxxxxxx (1962). Isochronism in English. University of Buffalo Studies in Linguistics - Occasional papers, 9. 1-36.
Xxxxxx, X.X. (1977). Preliminaries to a theory of action with reference to vision. In R.E. Xxxx & X. Xxxxxxxxx (eds.), Perceiving, acting and knowing: Toward an ecological psychology, 000-000. Xxxxxxxxx, XX: Erlbaum.
Xxxxxx, X.X. (1971). Isochronous stresses in RP. In L.L. Xxxxxxxxx, X. Xxxxxxxx & X. Xxxxxxx (eds.), Form and Substance, 205-210. Copenhagen: Akademisk Forlag.
Xxxx, X.X. & F. Wiolland (1982). Is French really syllable-timed? Journal of Phonetics 10. 193-216.
Wilson, X. & T.P. Wilson (2005). An oscillator model of the timing of turn-taking. Psychonomic Bulletin & Review 12. 957-968.
Xxxxxx Xx Xxxxxxxxx
University of Tuscia (Viterbo) (Roma)
Italy