Speech Production and Perception across
the Segment-Prosody Divide:
Data – Theory – Modelling
The concept of sound segments has traditionally played a central role in the
phonetic representation of words. It underlies the development of alphabetic writing
systems, of phonetic transcription and of phonemic theory. Other sound aspects,
especially pitch, but also energy, voice quality and rhythm, have been conceptualized as
being superimposed on segments in a broader frame of syllables and utterances. The
segment is associated with the short-time window of opening and closing movements
of the vocal tract, and simultaneously, with the differentiation of lexical and propositional
meaning, whereas prosodies are generally associated with long-time windows
of pitch, energy and voice quality control, and predominantly with attitudinal and
expressive utterance meaning, including the functions of attention seeking, intensity
signalling, and syntagmatic phrasing. This different substance-meaning duality
in sound segments and prosodies accounts for the current dichotomous mainstream
research paradigms of sounds and prosodies. The sound-prosody dichotomy has,
however, been repeatedly called into question, among others in the Firthian School
of Prosodic Analysis, following Firth’s seminal article ‘Sounds and Prosodies’ from
1948, for example in the study of such suprasegmental phenomena as vowel harmony
or long articulatory components of, e.g., palatalization, velarization, nasalization,
glottalization in the linguistic function of distinctively marking words and morphological
structures. Moreover, it has always been bridged in the analysis of lexical
stress, where segmental aspects of vowel duration and vowel spectrum, and prosodic
aspects of fundamental frequency and energy have jointly been taken into account in
a vast array of experimental investigations.
The reliance on linguistic form and phonetic substance in the analysis of sound
segments and prosodies reflects the tenets of 20th century structural linguistics, as it
relegates the functional aspect of speech communication to a post hoc level. Such a
dichotomous formal approach is a useful heuristic for coming to grips with the enormous
complexity of speech, especially in the initial stages of investigating a language.
Yet, the formal manifestations, analytically separated as sounds and prosodies, are
the joint expression of the manifold communicative functions in speech: semantic,
information-structural, expressive and attitudinal. If these functions are taken as the
superordinate control variable, the axiomatic formal dichotomy of sounds and prosodies
fades away because they interact, with varying weights, in the coding of specific
communicative functions. This functional approach to phonetic detail in segment-prosody
interaction was the empirical and theoretical theme of two recent plenary
talks at the 17th ICPhS in Hong Kong: ‘Does Phonetic Detail Guide Situation-Specific
Speech Recognition?’ by Sarah Hawkins and ‘On the Interdependence of Sounds and
Prosodies in Communicative Functions’ by Klaus Kohler. They were preceded by papers
in Journal of the Acoustical Society of America, Journal of Phonetics, and Phonetica:
Niebuhr, O.: Coding of intonational meanings beyond F0: evidence from
utterance-final /t/ aspiration in German. J. acoust. Soc. Am. 124: 1252–1263.
Hawkins, S.: Roles and representations of systematic fine phonetic detail in
speech understanding. J. Phonet. 31: 373–405 (2003).
Local, J.: Variable domains and variable relevance: interpreting phonetic
exponents. J. Phonet. 31: 321–339 (2003).
Kohler, K.: Communicative functions integrate segments in prosodies and
prosodies in segments. Phonetica 68: 26–56 (2011).
Kohler, K.; Niebuhr, O.: On the role of articulatory prosodies in German message
decoding. Phonetica 68: 57–87 (2011).
On the one hand, these investigations showed systematic phonetic detail in
talk-in-interaction as well as acoustic effects of segments on pitch patterns and of pitch
patterns on segments in the perceptual identification of semantic functions, and, on the
other hand, demonstrated the perceptual importance of long phonetic components of,
e.g., palatalization that are not linked to a segmentable sound unit but are superimposed
as an articulatory prosody on a wider stretch of speech.
We would like to make this sound-prosody relationship the theme of a special
issue of Phonetica and raise the central question:
How are sounds and prosodies intertwined, mutually shaping each other,
as a reflection of different communicative functions in speech interaction?
The papers we solicit are to take a renewed look in greater breadth and detail at this
interweaving of the threads of sounds and prosodies in a tapestry of speech communication
in a variety of languages, incorporating all forms of meaning – propositional,
attitudinal and expressive. The guiding principles for submissions are as follows:
• Papers present single-language or comparative analyses of new data in a variety of
languages that highlight the interdependence of short- and long-time windows of
speech production and/or perception in relation to specific communicative functions;
they discuss aspects of the theory of segment-prosody interdependence based
on language-specific, typological or universal relations between communicative
function and phonetic substance;
they attempt to model segment-prosody interaction in these function-substance
relations, for example in developing algorithms for contextually and situationally
adequate high-quality speech synthesis.
• Data can be either experimental or from corpora, unscripted ones in particular, and
experimentally collected items of speech need to be functionally and situationally
anchored, which rules out the widespread metalinguistic sentence frame of the
type ‘Say X again.’, commonly used in EMMA and EPG data acquisition.
• Potential topics may include:
– segment-prosody interdependence in talk-in-interaction,
– prosodic and segmental properties in the manifestation of speech functions,
for example different types of emphasis, in production and perception,
– contribution of vowel spectrum to lexical stress perception,
– spectral shaping of segments, for example fricatives and plosive releases, in
falling or rising f0 contours, and perceptual effects,
– articulatory prosodies in speech reduction, especially of function words, and
their importance in speech decoding,
– creation of rhythmic flow by preferred segmental patterns, such as high
versus low vowels, rather than the reverse, in flip-flop, sing song, ping-pong,
zig zag, wishy washy, or avoidance of phrase-internal obstruent breaks in
sonorant stretches, as in thunder and lightning against the semantically
obvious *lightning and thunder, or mum and dad, German Mama und Papa,
Oma und Opa,
– signalling of tone and intonation in whispered speech,
– contribution of segments and prosody to the generation of high-quality
speech synthesis and to (online) spoken word recognition.
Editorial Guidelines and Schedule
The total space available will be a double issue of the Journal. We expect to publish
approximately 12 contributions of 12 printed pages each on average. Submissions
need to follow the Phonetica style sheet (cf. ‘Instructions to Authors’ in any recent
issue) and should include Word and pdf
files. The dates of the editing schedule are as follows:
By 28 January, 2012: Submission by e-mail attachment of
an 800-word abstract, giving title, author(s), affiliation(s), and
e-mail address of main author.
29 February, 2012: Notification of authors whether the proposed papers have been
recommended as potential contributions to the theme by the
Editorial Team, and, if so, invitation to submit full versions for review.
By 31 May, 2012: Electronic submission of pdf files as e-mail attachments, to be sent out for review.
31 July, 2012: Intimation of final decision about acceptance for publication in
the special issue, including reviewers’ comments and suggestions
for revision. Due to the tight publication schedule only
papers requiring minor or moderate revision can be included
in the special issue. If major revision is necessary, authors will
be encouraged to resubmit for publication in an ordinary issue
of Phonetica.
By 20 August, 2012: Submission of final versions in Word and pdf by e-mail attachment.
End of 2012: Publication.

