These are important and obviously difficult issues, and it’s unfortunate that the popular press often oversimplifies them by asking questions like, Is language mainly innate or mainly acquired? Or similarly, Is IQ determined mainly by one’s genes or mainly by one’s environment? When two processes interact linearly, in ways that can be tracked with arithmetic, such questions can be meaningful. You can ask, for instance, “How much of our profits came from investments and how much from sales?” But if the relationships are complex and nonlinear, as they are for any mental attribute, be it language, IQ, or creativity, the question should be not, Which contributes more? but rather, How do they interact to create the final product? Asking whether language is mainly nature or mainly nurture is as silly as asking whether the saltiness of table salt comes mainly from chlorine or mainly from sodium.
The late biologist Peter Medawar provides a compelling analogy to illustrate the fallacy. An inherited disorder called phenylketonuria (PKU) is caused by a rarely occurring abnormal gene that results in a failure to metabolize the amino acid phenylalanine in the body. As the amino acid starts accumulating in the child’s brain, he becomes profoundly retarded. The cure is simple. If you diagnose it early enough, all you do is withhold phenylalanine-containing foods from the diet and the child grows up with an entirely normal IQ.
Now imagine two boundary conditions. Assume there is a planet where the gene is uncommon and phenylalanine is everywhere, like oxygen or water, and is indispensable for life. On this planet, retardation caused by PKU, and therefore variance in IQ in the population, would be entirely attributable to the PKU gene. Here you would be justified in saying that retardation was a genetic disorder or that IQ was inherited. Now consider another planet in which the converse is true: Everyone has the PKU gene but phenylalanine is rare. On this planet you would say that PKU is an environmental disorder caused by a poison called phenylalanine, and most of the variance in IQ is caused by the environment. This example shows that when the interaction between two variables is labyrinthine it is meaningless to ascribe percentage values to the contribution made by either. And if this is true for just one gene interacting with one environmental variable, the argument must hold with even greater force for something as complex and multifactorial as human intelligence, since genes interact not only with the environment but with each other.
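The arithmetic behind the two-planet thought experiment can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the IQ scores (40 and 100), the frequencies (0.01 and 1.0), the independence of gene and diet, and the names `iq` and `variance_shares` are all hypothetical choices made for the sketch, not figures from Medawar. For each factor, the code computes the fraction of the total IQ variance that the factor accounts for on its own.

```python
from itertools import product


def iq(has_gene: bool, eats_phe: bool) -> float:
    """Hypothetical scores: impairment only when gene and phenylalanine co-occur."""
    return 40.0 if (has_gene and eats_phe) else 100.0


def variance_shares(p_gene: float, p_phe: float) -> tuple[float, float]:
    """Return (share explained by gene alone, share explained by diet alone).

    Assumes gene and diet are independent, with frequencies p_gene and p_phe.
    """
    # Enumerate the four (gene, diet) cells with their population weights.
    cells = []
    for g, e in product([True, False], repeat=2):
        weight = (p_gene if g else 1 - p_gene) * (p_phe if e else 1 - p_phe)
        cells.append((g, e, weight, iq(g, e)))

    mean = sum(w * y for _, _, w, y in cells)
    total = sum(w * (y - mean) ** 2 for _, _, w, y in cells)
    if total == 0:
        raise ValueError("IQ does not vary in this population")

    def explained_by(index: int) -> float:
        # Variance of the conditional mean of IQ given one factor.
        between = 0.0
        for level in (True, False):
            group = [c for c in cells if c[index] == level]
            w_level = sum(c[2] for c in group)
            if w_level == 0:
                continue  # this level of the factor never occurs
            cond_mean = sum(c[2] * c[3] for c in group) / w_level
            between += w_level * (cond_mean - mean) ** 2
        return between / total

    return explained_by(0), explained_by(1)


for label, p_gene, p_phe in [
    ("gene rare, phenylalanine everywhere", 0.01, 1.0),
    ("gene everywhere, phenylalanine rare", 1.0, 0.01),
    ("both at 50 percent", 0.5, 0.5),
]:
    gene_share, diet_share = variance_shares(p_gene, p_phe)
    print(f"{label}: gene {gene_share:.3f}, diet {diet_share:.3f}")
```

Under these assumed numbers, the first planet attributes all of the variance to the gene and the second attributes all of it to diet, even though the function `iq` is identical on both. In the intermediate case each factor alone accounts for only a third, and the remaining third belongs to neither one separately, which is the sense in which percentage attributions stop being meaningful.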
Ironically, the IQ evangelists (such as Arthur Jensen, William Shockley, Richard Herrnstein, and Charles Murray) use the heritability of IQ itself (sometimes called “general intelligence” or “little g”) to argue that intelligence is a single measurable trait. This would be roughly analogous to saying that general health is one thing just because life span has a strong heritable component that can be expressed as a single number: age! No medical student who believed in “general health” as a monolithic entity would get very far in medical school or be allowed to become a physician, and rightly so, and yet whole careers in psychology and political movements have been built on the equally absurd belief in a single measurable general intelligence. The contributions of these evangelists have little more than shock value.
Returning to language, it should now be obvious which side of the fence I am on: neither. I straddle it proudly. Hence this chapter is not really about how language evolved, though I have been using that phrasing as shorthand, but about how language competence, or the ability to acquire language so quickly, evolved. This competence is controlled by genes that were selected for by the evolutionary process. Our questions in the rest of this chapter are, Why were these genes selected, and how did this highly sophisticated competence evolve? Is it modular? How did it all get started? And how did we make the evolutionary transition from the grunts and howls of our apelike ancestors to the transcendent lyricism of Shakespeare?
RECALL THE SIMPLE bouba-kiki experiment. Could it hold the key to understanding how the first words evolved among a band of ancestral hominins in the African savanna between one hundred thousand and two hundred thousand years ago? Since words for the same object are often utterly different in different languages, one is tempted to think that the words chosen for particular objects are entirely arbitrary. This in fact is the standard view among linguists. Now, maybe one night the first band of ancestral hominins just sat around the tribal fire and said,
“Okay, let’s all call this thing a bird. Now let’s all say it together, biiirrrrddddd. Okay let’s repeat again, birrrrrrrdddddd.”
This story is downright silly, of course. But if it’s not how an initial lexicon was constructed, how did it happen? The answer comes from our bouba-kiki experiment, which clearly shows that there is a built-in, nonarbitrary correspondence between the visual shape of an object and the sound (or at least, the kind of sound) that might be its “partner.” This preexisting bias may be hardwired. It may have been very small, but it may have been sufficient to get the process started. This idea sounds very much like the now discredited “onomatopoeic theory” of language origins, but it isn’t. “Onomatopoeia” refers to words that are based on an imitation of a sound: for example, “thump” and “cluck” to refer to certain sounds, or how a child might call a cat a “meow-meow.” The onomatopoeic theory posited that the sounds associated with an object become shorthand to refer to the object itself. But the theory I favor, the synesthetic theory, is different. The rounded visual shape of the bouba doesn’t make a rounded sound, or indeed any sound at all. Instead, its visual profile resembles the profile of the undulating sound at an abstract level. The onomatopoeic theory held that the link between word and sound was arbitrary and merely occurred through repeated association. The synesthetic theory says the link is nonarbitrary and grounded in a true resemblance of the two in a more abstract mental space.
What’s the evidence for this? The anthropologist Brent Berlin has pointed out that the Huambisa tribe of northern Peru have over thirty different names for thirty bird species in their jungle and an equal number of fish names for different Amazonian fishes. If you were to jumble up these sixty names and give them to someone from a completely different sociolinguistic background—say, a Chinese peasant—and ask him to classify the names into two groups, one for birds, one for fish, you would find that, astonishingly, he succeeds in this task well above chance level even though his language doesn’t bear the slightest shred of resemblance to the South American one. I would argue that this is a manifestation of the bouba-kiki effect, in other words, of sound-shape translation.1
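What “well above chance level” means can be made concrete with a short Python sketch. If a sorter were merely guessing, each of the sixty names would land in the right group with probability one half, so the number of correct answers would follow a binomial distribution. The function below computes the probability of getting at least a given number right by guessing alone. The function name `p_at_least` and the score of 40 out of 60 are hypothetical, chosen only to illustrate the calculation; they are not results reported by Berlin.

```python
from math import comb


def p_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more correct out of n when each guess succeeds with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical score: 40 of the 60 names sorted correctly.
chance_probability = p_at_least(40, 60)
print(f"Probability of 40 or more correct by guessing: {chance_probability:.4f}")
```

The smaller this probability, the harder it is to explain the sorter’s score as luck, and the more plausible it becomes that something in the sound of the names is carrying information about what they refer to.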
But this is only a small part of the story. In Chapter 4, I introduced some ideas about the contribution mirror neurons may have made to the evolution of language. Now, in the remainder of this chapter, we can look at the matter more deeply. To understand the next part, let’s return to Broca’s area in the frontal cortex. This area contains maps, or motor programs, that send signals down to the various muscles of the tongue, lips, palate, and larynx to orchestrate speech. Not coincidentally, this region is also rich in mirror neurons, providing an interface between the oral actions for sounds, listening to sounds, and (least important) watching lip movements.