Elsevier

Journal of Voice

Volume 22, Issue 5, September 2008, Pages 553-564
Vibratory Regime Classification of Infant Phonation

https://doi.org/10.1016/j.jvoice.2006.12.009

Summary

Infant phonation is highly variable in many respects, including the basic vibratory patterns by which the vocal tissues create acoustic signals. Previous studies have identified the regular occurrence of nonmodal phonation types in normal infant phonation. The glottis is like many oscillating systems that, because of nonlinear relationships among the elements, may vibrate in ways representing the deterministic patterns classified theoretically within the mathematical framework of nonlinear dynamics. The infant's preverbal vocal explorations present such a variety of phonations that it may be possible to find effectively all the classes of vibration predicted by nonlinear dynamic theory. The current report defines acoustic criteria for an important subset of such vibratory regimes, and demonstrates that analysts can be trained to reliably use these criteria for a classification that includes all instances of infant phonation in the recorded corpora. The method is thus internally comprehensive in the sense that all phonations are classified, but it is not exhaustive in the sense that all vocal qualities are thereby represented. Using the methods thus developed, this study also demonstrates that the distributions of these phonation types vary significantly across sessions of recording in the first year of life, suggesting developmental changes. The method of regime classification is thus capable of tracking changes that may be indicative of maturation of the mechanism, the learning of categories of phonatory control, and the possibly varying use of vocalizations across social contexts.

Introduction

One required aspect of spoken language development is an ability and preference for modal voice, as heard in typical adult speech. Oller1 has listed normal phonation as the first step toward mastery of canonical syllable production, commonly known as “babbling,” which occurs typically around the seventh month of life. Caregivers, researchers, and others primarily interested in tracking incipient language understandably attend to productions spoken with modal voice, as being indicative of emerging linguistic control, while treating squealy or growly voices as pertaining to more paralinguistic communication indicating emotion, attitude, or overall fitness. It may be that both modal and nonmodal voice uses are important in infants' development of vocal control. Consequently, there should be considerable explanatory value in categorizing phonatory patterns found in infancy, and in their developmental course leading to fine control of the laryngeal source mechanisms, as a key foundation for speech.

Studies of infant phonation per se have focused almost exclusively on crying as an indicator of health status.2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 Even in some of the earliest acoustic work in this area, spectrographic inspection of harmonic structure revealed categorically distinct nonmodal phonation types such as pitch breaks into loft register and sudden appearances of subharmonics.15 However, it was anticipated by such early researchers that these phonation types would serve as differential indicators of neurological or structural pathologies. It has meanwhile become apparent that these phonation types are prevalent in the cries of quite normally developing infants.16

Using spectrographic inspection, classic work by Stark in the 1970s identified subharmonic phonation as a regular feature of cry and discomfort vocalizations.17 Since then, other researchers have sporadically noted the regular presence and significant frequencies of many nonmodal vibration patterns in the comfort vocalizations of normally developing infants. Keating18 was among the first researchers to reveal the variety of nonmodal types in infant phonation (generally classifying “Fry” and “High” phonations but observing many features within these registers similar to those observed in the present work, such as pulse, subharmonics, and loft), and to interpret this variety in terms of laryngeal configurations. In the context of a comprehensive acoustic characterization of infant noncry vocalizations, Kent and Murray19 noted a high incidence of alternative phonations in 3-, 6-, and 9-month-old infants (including “biphonation” and “fry,” and again observing many examples of harmonic doubling). Somewhat later, Robb and Saxman20 inventoried occurrences of nonmodal phonation within a large sample of noncry vocalizations by young children aged 11–25 months (specifically, “biphonation,” “harmonic doubling,” and “fundamental frequency shifts”), and thereby established the normality of these phonations even in older children who were babbling and producing early words. A recent study by Rvachew et al focusing on methodological issues in the acoustic classification of infants' syllable inventories also acknowledged and included “abnormal” phonations (including a subset of the regimes explored here, such as “biphonation” and “harmonic doubling”), and found that they could be regularly and reliably identified.21

Concurrent with these developmentally oriented research efforts, theoretical work applying nonlinear dynamics to voice was also identifying vibratory regimes such as harmonic doubling and biphonation,22, 23 and one of the earliest such reports focused on newborn infant cries.24 As has been overviewed by several useful tutorial pieces,25, 26 nonlinear dynamics provides an organizing framework for vocal vibratory regimes. This is because the array of diverse vibration types can be mathematically understood to result from a single dynamic system, usually as a function of a small set of control parameters, such as subglottal pressure. As a result of variations in such a control parameter, the system can be observed to jump suddenly, or “bifurcate,” from one vibratory regime to another. Crossing of the phonatory threshold, by which the vocal folds held static by medial compression suddenly begin to vibrate when subglottal pressure is increased to a certain critical value, is perhaps the simplest example of such a bifurcation. Indeed, the very suddenness by which some vocal fold vibration types appear and disappear in phonation, as will be seen below in subharmonics for example, is a hallmark of nonlinear dynamic systems undergoing bifurcations. This discreteness is also a methodological boon to the demands of a classification scheme.
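The idea of a bifurcation can be illustrated with a textbook system. The sketch below uses the logistic map, which is not a model of the vocal folds and is not drawn from this report; it is only a minimal stand-in showing how a single deterministic rule jumps between discrete regimes (steady, period 2, period 4, aperiodic) as one control parameter `r` is varied. The function names and parameter values are illustrative assumptions.

```python
def logistic_orbit(r, x0=0.5, transient=2000, keep=64):
    """Iterate x -> r*x*(1-x), discard the transient, return the settled orbit.

    The logistic map is a generic illustration of nonlinear dynamics,
    NOT a phonation model; r is only loosely analogous to a control
    parameter such as subglottal pressure.
    """
    x = x0
    for _ in range(transient):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def detect_period(orbit, max_period=16, tol=1e-6):
    """Return the smallest p <= max_period at which the orbit repeats, else None."""
    for p in range(1, max_period + 1):
        if all(abs(orbit[i] - orbit[i + p]) < tol for i in range(len(orbit) - p)):
            return p
    return None  # no short period found: treated here as aperiodic


if __name__ == "__main__":
    for r in (2.8, 3.2, 3.5, 3.9):
        print(r, detect_period(logistic_orbit(r)))
```

With these settings the sketch reports periods 1, 2, and 4 for the first three values of `r` and finds no short period at `r = 3.9`. The regime changes at particular values of `r` rather than gradually, which is the property that makes discrete classification feasible.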

The concept of chaos as a vibratory regime offers another conceptual and methodological advantage to the understanding of vocal fold vibration via nonlinear dynamics; although appearing to be as noisy as purely stochastic turbulence, chaos can be treated as just another type of vibration that occurs within systems that are low dimensional, as governed by the small set of parameters comprising typical models of phonation. Although some authors may use the term “chaos” to refer to all the oscillatory possibilities of nonlinear dynamic systems, inclusive of periodic behaviors, we will reserve the term in our coding scheme described below to refer to only those behaviors that appear to be dominated by an aperiodic vibration of the vocal folds. See Jiang et al27 for a recent review of the usefulness of low-dimensional modeling and algorithmic measurement of chaos in the understanding of pathological phonation.

The central rationale of the current project is as follows: infants, via the wide variety of phonatory conditions experienced in the first year of development, may be manipulating a nonlinear dynamic system through its range of possible vibratory regimes. The mathematical theory of nonlinear dynamics specifies that these vibratory regimes can be unified and classified under a single framework. It should therefore be possible to inspect, discretely bound, and classify all infant phonations under this framework, selecting among a reasonably small set of possible regimes.

It may also be possible to apply signal-processing algorithms to the measurement of at least some such regimes.27, 28 However, these algorithms assume certain signal conditions that are not likely to be satisfied by freely produced infant vocalizations that often also include nonphonatory sources, articulations, modulations, etc. For such conditions, the most valid and reliable classifications may be achieved through auditory and visual inspection by human analysts. A specific proof of concept is thereby motivated: Can analysts be trained to exhaustively analyze a continuous record of the phonation occurring within infants' spontaneous vocalizations into a small set of vibratory regimes that can be identified with the possibilities expected under the theory of nonlinear dynamics? The purpose of this report is to demonstrate just this possibility, and to furthermore demonstrate that the resulting classifications can help to document infants' developing phonatory control.

The objective of applying regime analysis to infant phonation, however, is motivated not directly by the theory of nonlinear dynamics, but by the goal of understanding phonatory and speech development in infants. The analysis of regimes described in this report is not oriented therefore to the detection of every theoretically discernable vibration type, but rather to the classification of vocal behaviors appearing to have interpretive significance for a developmental analysis. Only those bifurcations that could be efficiently and meaningfully tracked were therefore targeted in the training and analysis protocols reported below. In particular, sudden pitch breaks or shifts clearly fit the paradigm of nonlinear dynamics as bifurcations, but the regimes that are delineated by such shifts may all be modal in perceived quality and apparent dynamics. Although these regimes might therefore be considered theoretically distinct because a dynamic break occurs between them, the regimes themselves do not necessarily carry the same interpretive significance as a break from, say, modal to loft.

By the same token, it should be acknowledged that a nonlinear dynamic classification scheme by no means encompasses all vocal qualities of interest. Many variations in quality may occur within a class, most notably within the infant modal voice in which varying degrees of breathiness, harshness, pressed, and other qualities occur. These were not targeted as distinct classes from the current perspective, but may be revisited using other tools such as perturbation analysis, estimates of glottal turbulence noise,29 relative harmonic amplitudes,30 and other tools oriented to general voice quality analysis.31

The research on these phonation types may also contribute to explication of the widely reported tendency of parents to recognize “categories” of vocalization produced by infants during the first half year of life, categories that are described impressionistically in a way that suggests that phonation type is the primary factor determining the categorization.32, 33, 34 In particular, many observers have indicated at least three widely recognizable categories occurring in most infants: a mid-pitch category (vowel-like sounds, full vowels, or quasi vowels), a high-pitch category (squeals or squeaks), and a category that can be either low in pitch or mid in pitch with very harsh vocal quality (growls).35, 36, 37 These apparent categories are recognized by their repetitive and systematically alternating occurrence within sessions of recording.38 Observers appear to assign the vocalizations to squeal, vowel, or growl categories based on some “predominant” or “most salient” characteristic of the utterance.

The importance of these apparent categories of vocalization in the first months of life has been argued to be fundamental, because they appear to represent the first contrastive vocal categories that are created by the infant. This ability to form new vocal categories, unknown in other primates, has been argued to form a necessary basis for the creation and learning of further contrastive vocal categories required for speech.39 The systematic study of vocal regimes and their perception is, we reason, the appropriate method to begin to unravel the nature of these categories, including how they are physically composed and how they are perceived.

The distinct regimes targeted in this application are listed in Table 1, and the distinguishing characteristics of each regime are presented in the following Method section. After developing basic methods and operational characteristics for regime classification, the report describes aspects of the protocols by which analysts were trained to perform classifications. Two types of results are then presented: (1) the reliability with which analysts classified utterance sets selected to represent all regime types; and (2) an application of the classification to all the phonatory events in recordings from a female infant at three different developmental stages in the first year of life, which examines the apparent developmental significance of the regime classifications.

Materials

Materials for training and reliability of regime coding were selected from three female infants aged between 4 and 11 months. These were normally developing infants recruited for a study of spontaneous vocalizations across varying social contexts, and no laryngeal or other clinical examinations were included in this protocol. Materials for the developmental analysis of regimes comprised all the nondistress vocalizations from three 20-minute sessions recorded from one of these female infants at

Reliability of regime location and identification

Classification of infant phonation episodes for vibratory regimes actually involves three conceptually distinct decisions: (1) determining whether the sound should be considered to involve glottal behavior, or “intended phonation,” eligible for vibratory regime classification, (2) deciding on the classification, and (3) marking the onset and offset times of a distinct class. Each of these analysis decisions is evaluated separately in the following subsections. Preparatory to performing the
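The three decisions listed above (eligibility, regime label, and onset/offset marking) map naturally onto a simple record type. The sketch below is a hypothetical representation, not the authors' coding software: the class name, field names, and the tallying function are assumptions, and the regime labels are placeholders taken from terms used in this text rather than the contents of Table 1.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CodedInterval:
    """One analyst-marked stretch of a recording (hypothetical structure)."""
    onset_s: float           # decision 3: where the interval starts
    offset_s: float          # decision 3: where the interval ends
    eligible: bool           # decision 1: counts as intended phonation?
    regime: Optional[str]    # decision 2: label, or None if not eligible

    @property
    def duration_s(self) -> float:
        return self.offset_s - self.onset_s


def regime_time_shares(intervals: List[CodedInterval]) -> Dict[str, float]:
    """Share of eligible phonation time spent in each regime."""
    totals: Dict[str, float] = defaultdict(float)
    for iv in intervals:
        if iv.eligible and iv.regime is not None:
            totals[iv.regime] += iv.duration_s
    grand_total = sum(totals.values())
    if grand_total <= 0:
        return {}
    return {name: t / grand_total for name, t in totals.items()}


# Made-up example session; the times and labels are illustrative only.
session = [
    CodedInterval(0.0, 0.5, True, "modal"),
    CodedInterval(0.5, 0.75, True, "subharmonics"),
    CodedInterval(0.75, 1.0, False, None),
    CodedInterval(1.0, 1.25, True, "modal"),
]
shares = regime_time_shares(session)
```

Keeping the eligibility flag separate from the label lets each decision be checked between analysts on its own terms. Computing the shares once per recording session gives one simple way to compare regime distributions across sessions.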

Discussion

In summary, this manuscript has demonstrated that infant phonations can be exhaustively classified by vibratory regime types. The specification of those types in accord with the framework of nonlinear dynamics provides additional external validation that the typology may ultimately be associated with physical mechanisms of production. It was also demonstrated, in at least one normally developing infant, that the regime classifications may be developmentally significant, with the somewhat

Acknowledgments

This work has been supported by grants from the National Institute on Deafness and Other Communication Disorders (R01DC006099 to D. K. Oller PI and Eugene H. Buder Co-PI). The authors are also grateful to the caregivers of our participants for their time and generosity, to the graduate student research assistants who contributed to this study as analysts, and to Jamie L. Edrington for her assistance with training and reliability assessments.

References (48)

  • R.G. Barr et al. Crying patterns in preterm infants. Dev Med Child Neurol (1996)
  • A.M. Goberman et al. Acoustic examination of preterm and full-term infant cries: the long-time average spectrum. J Speech Lang Hear Res (1999)
  • E.L. Grauel et al. Jitter-index of the fundamental frequency of infant cry as a possible diagnostic tool to predict future developmental problems: part 2: clinical considerations. Early Child Dev Care (1990)
  • J.R. Irwin. The cry as a multiply specified signal of distress. Dissertation Abstracts International: Section B: The Sciences and Engineering (1999)
  • B.M. Lester et al. A biobehavioral perspective on crying in early infancy
  • C.J. Thodén et al. Sound spectrographic cry analysis of pain cry in prematures
  • O. Wasz-Hockert et al. The identification of some specific meanings in the newborn and infant vocalization. Experientia (1964)
  • P.H. Wolff. The natural history of crying and other vocalizations in early infancy
  • P.S. Zeskind et al. Acoustic characteristics of naturally occurring cries of infants with “colic.” Child Dev (1997)
  • M.P. Robb et al. An acoustic template of newborn infant crying. Folia Phoniatr Logop (1997)
  • P. Sirvio et al. Sound spectrographic cry analysis of normal and abnormal newborn infants. Folia Phoniatr Logop (1976)
  • M.P. Robb. Bifurcations and chaos in the cries of full-term and preterm infants. Folia Phoniatr Logop (2003)
  • R.E. Stark et al. Features of infant sounds: the first eight weeks of life. J Child Lang (1975)
  • P. Keating. Patterns of fundamental frequency and vocal registers
