Over the last half dozen sections of this chapter we have analyzed the constituents of events and language: the hits, slides, and rings. Hits, slides, and rings may be the fundamental building blocks for human speech, but that alone doesn’t make speech sound natural. Just as natural contours can be combined in unnatural ways for vision, natural sound atoms can be combined unnaturally for audition. Language will not effectively harness our auditory system if speech combines plosives, fricatives, and sonorants in unnatural ways, like “yowoweelor” or “ptskf.” To find out whether speech sounds like nature, we need to understand how nature’s phonemes combine, and then see whether language combines its own phonemes in the same way. For the rest of this chapter, we will look at successively larger combinations of sounds. But we turn first to the simplest combination.
Nature’s Syllables
My friend’s boy made a video of himself solving a Rubik’s Cube blindfolded, and then posted it on the Web. As I watched him put the blindfold on, pick up the cube, and begin twisting, I noticed something strange about the sound, but I couldn’t put my finger on what was unusual. Later, when I commented to my friend how his bright boy must owe it to inheritance, he replied, “Indeed, the apple doesn’t fall far from the tree. He faked it. The movie was in reverse.”
The world does not sound the same when run backward. What had raised my antennae when watching the Rubik’s Cube video was the unusual sounds that occur when one hears events in reverse. One of the first strange sounds occurred when he picked up the cube at the start of the video. Knowing now that it was shown in reverse, what appeared in the video to be him picking up the cube to begin unscrambling it was actually him setting the cube down after having scrambled it. Setting the cube down caused a hit and a ring, but in reverse what one hears is a ring coming out of nowhere, and ending with a sudden ring-stopping hit (the second voice of a hit, as discussed earlier in the section titled “Two-Hit Wonder”). That just doesn’t happen much in nature. When nature comes to the door, it knocks before ringing, not the other way around. Rings don’t start events. Rings are due to the periodic vibrations of objects, and objects do not typically ring without first being in physical contact with another object. Rings therefore do not typically occur without a hit or slide occurring first.
Hits, slides, and rings may be the fundamental building blocks for events, but rings are a different animal from hits and slides. Hits and slides involve objects in motion, physically interacting with other objects. Hits and slides are the backbone of the causal chain in an event. Rings, on the other hand, occur as a result of hits or slides, but don’t themselves cause more events. Rings are free riders, contributing nothing to the causality. Events do not have a ring followed by another ring. That’s impossible (although a single complex, or wiggly, ring is possible, as we discussed in an earlier section). And events never have an interaction (i.e., a hit or a slide) followed directly by another interaction without an intervening ring. Sometimes a ring will be inaudible, and so there will appear to be two interactions without an intervening ring, but physically there’s always an intervening ring, because objects that are involved in a physical interaction always vibrate to some extent. Events also always end with a ring, although whether it is audible is another matter.
The most basic way in which hits, slides, and rings combine is, then, this:
Interaction—Ring
where the interaction can be either a hit or a slide. If we let c stand for a hit or a slide (because “c” can be pronounced either as a plosive, “k,” or as a fricative, “s”), and a stand for a ring (which, recall, can sometimes be wiggly), then the fundamental structure of solid-object physical events is exemplified by caca. Not acac. Not cccaccca. Not accacc. And so on. Letting b stand for hits and s for slides, events take forms such as ba, sa, baba, saba, basaba, and so on. Not ab or sba or a or bbb or ssb or assb or the like. This interaction-ring combination is perhaps the most fundamental event regularity in nature, and is perhaps the most perceptually salient. Objects percussively interact via either a hit or slide, and give off a ring. Our auditory system—and probably that of most other mammals—is designed to expect nature’s phonemes to come in this interaction-ring form.
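The rule just described can be written down as a simple pattern: one or more repetitions of an interaction followed by a ring. The sketch below is only an illustration of that rule, not a model of real acoustics. It reuses the symbols from the text (b for a hit, s for a slide, c for either, a for a ring), while the function name and the choice of a regular expression are assumptions made for this example.

```python
import re

# Symbols follow the text: b = hit, s = slide, c = hit or slide, a = ring.
# An event is one or more interaction-ring pairs, and nothing else.
EVENT_PATTERN = re.compile(r"(?:[bsc]a)+")


def is_natural_event(sequence: str) -> bool:
    """Return True if the whole sequence is built from interaction-ring pairs."""
    return EVENT_PATTERN.fullmatch(sequence) is not None


# Forms the text lists as natural and as unnatural.
natural = ["caca", "ba", "sa", "baba", "saba", "basaba"]
unnatural = ["acac", "cccaccca", "accacc", "ab", "sba", "a", "bbb", "ssb", "assb"]

for seq in natural + unnatural:
    label = "natural" if is_natural_event(seq) else "unnatural"
    print(f"{seq}: {label}")
```

Because the pattern must match the entire string, a ring with no interaction before it (as in ab or a) and two interactions with no ring between them (as in sba or bbb) are both rejected, which mirrors the two prohibitions in the text.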
Given the fundamental status of interaction-ring combinations, if language harnesses the innate powers of our auditory system, then we expect language to be built out of vocalizations that sound like interaction-ring. Do languages have this feature? That is, do plosives and fricatives tend to be followed by sonorants? Yes. A plosive or fricative followed by a sonorant is, in fact, the most basic and most common phoneme combination across languages. It is the quintessential example of a syllable. Words across humankind tend to look approximately like ca, or caca, or cacaca, where c stands for a plosive or fricative, and a for one or more consecutive sonorants. All languages have syllables of this ca form. And some languages, such as Japanese, allow little besides syllables of this form.
Whereas interaction-ring is the most fundamental natural combination of event atoms, ring-interaction is a combination that essentially never occurs in nature. A ring followed by an interaction sounds out of this world, as in my friend’s son’s Rubik’s Cube video. We therefore expect that languages tend to avoid combinations like ac and acac. This is, in fact, the case. The rarest syllable type is of this ac form, and words that start with a sonorant followed by a plosive or fricative are rare. In data I collected at RPI in 2008 with the help of undergraduate student Elizabeth Counterman and graduate student Kyle McDonald, about 80 percent of our sampled words (with three or fewer non-sonorants) across 18 widely varying languages begin with a plosive or a fricative. (See the legend of Figure 9 for a list of the sampled languages.) And a large proportion of the words starting with a sonorant start with a nasal, like “m” and “n,” the least sonorant-like of the sonorant consonants (nasals at word starts can have a fairly sudden start, and are more plosive-like than other sonorant consonants).
Note that a word starting with a vowel does not start with a sonorant, because when one speaks such a word, the utterance actually begins with something called a glottal plosive, produced via the sudden hitlike release of air at one’s voice box. To illustrate the glottal plosive, slowly say “packet,” and then slowly say “pack it.” When you say the latter, there can often be a sharp beginning to the “it,” something that will never occur before the “et” sound in “packet.” That sharp beginning is the glottal plosive. Words starting with sonorants are, thus, less common than one might at first suspect. Even words like “ear,” “I,” “owe,” and “owl,” then, are cases of plosives followed by sonorants, and agree with the common hit-ring (the most common kind of interaction-ring) structure of nature.