Sounds Wild and Broken, page 8
Today ocean waters are a tumult of engine noise, sonar, and seismic blasts. Sediments from human activities on land cloud the water. Industrial chemicals befuddle the sense of smell of aquatic animals. We are severing the sensory links that gave the world its animal diversity: whales cannot hear the echolocating pulses that locate their prey, breeding fish cannot find one another amid the noise and turbidity, and the social connections among crustaceans are weakened as their chemical messages and sonic thrums are lost in a haze of human pollution. Combined with overfishing and climate change, these assaults produce what biologists call the defaunation of the oceans: 90 percent declines among large fish, ongoing losses of marine mammals, calamitous reductions in coral reefs, and, although data on many species are scarce, sharp population and range contractions among many other ocean-dwelling animals. The best current estimate is that about one-quarter of marine species face an imminent risk of extinction and many more are in decline.
Sound is one of animal life’s ancient creative processes. The title of Jacques Cousteau’s film Le Monde du Silence was a manifestation of our ignorance about aquatic sounds. It was also an unintended warning about the consequences of our actions for other species. As we get louder and more voracious, we silence other living voices, cutting back both the diversity and the evolutionary creativity of the oceans.
* * *
In the long view, we owe our human voices to milk. Specifically, the milk that ancient protomammalian mothers fed to their young. Before the evolution of lactation, protomammalian youngsters nourished themselves on whatever the environment supplied, sometimes brought to them by parents but often foraged for themselves. This diet of seeds, plant material, and small animal prey demanded guts able to digest complex and sometimes hard foods. Energy and nutrients were often in short supply, limiting the growth rate of the young. The invention of nutritive skin secretions broke these constraints and supercharged infancy. Mothers did the hard work of catching and digesting prey, then offered rich and easy-to-assimilate food. Nursing offspring connected directly to the strength and generosity of their mothers’ bodies. Although the earliest stages of the evolution of lactation are still unclear, studies of the DNA of modern animals show that by two hundred million years ago, female mammals possessed mammary glands and specialized milk proteins. In addition to changes in the physiology and behavior of mothers, this new method of feeding demanded a reworking of the throats of infants. Much later, these innovations would allow humans to speak. Our languages are bequests from these ancient mothers.
No reptile can suck. Their mouths, tongues, and throats are weak and lack skeletal support for complex muscles. This changed early in the evolution of mammals. The thin V-shaped hyoid bone in reptilian necks transformed into a stout four-fingered saddle. Muscles attached to these fingers, strengthening and stabilizing the tongue, mouth, larynx, and esophagus. Judging by fossil evidence, by 165 million years ago the mammalian hyoid and its muscles had turned the slack, open maw of reptiles into a powerful and coordinated sucking device.
The diversification of the mammalian clan was built on the unique nutritive bond between mothers and offspring, a connection made possible by both mammary glands and throat anatomy. To this day, young mammals are born with fully developed hyoid bones, even when other bones are still partly grown. Adult mammals, too, benefit, masticating and manipulating food with their mouths in ways impossible for reptiles.
Although the primary function of the hyoid is to support feeding, evolution has also put it to use in the shaping of sound. The larynx imparts sound to air flowing up the windpipe from the lungs. These sonic vibrations then stream into the upper part of the windpipe, the mouth, and the nasal cavities, before flying free to find listeners. The mammalian hyoid and its muscles allow animals to change the shape and resonance of throat and mouth, giving sound its timbre and nuance, squelching some frequencies and lifting others. The hyoid both supports the mouth and tongue and anchors the larynx.
When we call the knobby larynx in our throats the voice box, we do a disservice to the complex architecture within our upper throats and heads, places where voice finds its shape and character. Open your mouth wide, flatten your tongue, keep your head immobile, and then try to speak: most of your vocal capacity disappears. The mammalian vocal system, then, acts like many musical instruments. The larynx is the reed in an oboe. The upper vocal tract is the oboe’s body and finger keys.
Evolution has crafted many variations of the mammalian vocal tract, each suited to the ecological or social context of the species. In echolocating bats, part of the hyoid connects the larynx to a bony plate at the base of the middle ear. This connection allows the nervous system to compare the outgoing pulse of sound from the larynx with the returning echo in the ear. Toothed whales use their giant vocal folds to make whistles, but their echolocating pulses come from nasal air sacs below the blowhole. These whales feed not only by biting and grasping but by sucking large prey like squid out of the water then swallowing them whole. To support this predatory sucking, their hyoid bones are massive and have flattened surfaces for the attachment of muscles. Ultrasonic sounds in some rodents come from a larynx that directs a narrow stream of sound at a sharp ridge of tissue, somewhat like the air-to-edge sounds evoked in pipe organs or flutes. Among some roaring mammals—red deer, Mongolian gazelles, and lions and their relatives—deep sounds are achieved by lowering the larynx within the windpipe, elongating the vocal tract. This descent happens seasonally, dropping during the breeding season, and during the roar itself when the larynx falls then springs back up. The hyoid and its muscles and ligaments support this trombone-like slide. Because low sounds come from big bodies, the larynx’s movement presumably serves to make an impression on listeners, the equivalent perhaps of human motorcyclists modifying their exhaust pipes to give the sonic impression of large, powerful engines.
The vocal tracts of primates seem especially amenable to evolution’s creative powers. Compared with those of carnivores, for example, the larynges of primates are larger, have evolved faster, and are more variable in relation to body size. Many primates have large air sacs connected to the larynx that act as bellows and resonators. The most extreme of these modifications is among the howler monkeys, famed in the American tropics for their low, far-reaching roars and rumbles. In addition to large, paired air sacs on their necks, howler monkeys have a hyoid bone expanded into a large air-sac-containing cup that acts as an amplifier and broadcaster.
Strangely, we humans have no extraordinary elaborations of our vocal equipment. Our larynx and hyoid are about the size we would expect for an animal of our weight. Somehow, we’ve achieved the great complexity and nuance of spoken language with tweaks to basic mammalian gear. Losing the laryngeal sacs was likely a key early step. The bulbous sacs of our close cousins, the other great apes, are fabulous for making screams and moans that carry through the forest, but not so good for subtlety. We do not know why our ancestors lost these throat balloons. Perhaps early hominins benefited from quieter, more nuanced vocalizations or the sacs may have impeded them when they became bipedal runners and stalkers on the savannah. Whatever the reason, the loss of these encumbrances likely cleared the way for the neck and mouth to take on their modern human form.
Gently press your fingertip in the soft space under your chin, behind your lower jawbone. Now extend your chin a little and run your finger backward. At the junction of neck and jaw underside, your fingertip will find the front of the hyoid bone that wraps back into your neck. The ancestral mammalian four-fingered design of the bone remains, although two fingers dominate, giving it a horseshoe shape. This is the only bone in the body not attached to any other bone. Instead, it is suspended from the skull and jaw by strong straps of tissue. Keep moving your fingertip back and down. The next hard lump is the larynx, a thickened part of the windpipe. Inside, inaccessible to probing fingers, are the vocal folds. The larynx is suspended from the hyoid.
When we are born, the hyoid and larynx are pressed up against the back of the palate, as they are in many other mammals. As we grow, they both drop down. In adulthood, the hyoid sits just below the level of the lower jaw with the larynx suspended below, in the neck. The “Adam’s apple” visible in many men results from rapid growth of the larynx and its cartilage during male puberty, resulting in lower-pitched voices.
Sound waves from vocal folds in the human larynx flow upward into a vertical stretch of windpipe leading to the back of the mouth. From there, sound moves forward, from the back of the throat to the lips. Say aah into the mirror and you’ll see the horizontal space of the mouth take an abrupt downward turn behind your tonsils. Each space, throat and mouth, has its own resonance, adjustable through muscular action. The tongue is the ever-active mediator between these two resonant passages. No sound passes from one to the other without its involvement.
Articulate human speech starts with fine control of breath from the lungs. In the larynx, the vocal folds are drawn into the flow of breath and start vibrating, just as the mouth of a balloon vibrates when air rushes out. In most mammals, these folds are entrained in the flow of air, and their elasticity causes them to move back and forth, creating sound waves in the air. In the purr of a cat, these vibrations are boosted by rapidly pulsing muscles, but other mammals lack this enhancement. Sounds from the larynx then pass to the upper part of the throat and into the mouth. There the shape of airway and mouth enhances some frequencies and suppresses others. The tongue further filters the sounds as they flow into the mouth, where tongue, cheeks, jaw, and teeth also sculpt the sound. As the sound departs the oral cavity, the lips impart plosive emphasis or hiss and, finally, the sound wings free into the air. Every part of this web of interacting muscles, bones, and soft tissues plays an essential role. Try speaking without air from the lungs, squirming tongue, or dancing lips. Impossible. The foundation stone for the whole edifice is the hyoid, the legacy of the first milk-producing mammalian mothers and their suckling offspring.
Attending to the differences between vowels and consonants gives us a sense of the importance of each part of the vocal tract. We hiss, spit, growl, and squeeze consonants from our mouths by constricting the air flow with our throat, lips, or teeth: shh, buh, grr, kah. Air flows freely from the larynx for vowels, shaped only by the tongue: eee, ooo, aaa. In each case, the larynx provides raw sounds that the mouth then sculpts. Khoomei singers, known in the West as Tuvan throat singers, take this to an extreme, using constrictions created by their tongues to filter out all but a few overtones while their tightened larynx drones. Theirs is a sophisticated vocal art that builds on the interplay of the larynx and mouth we all use as we speak or sing. The same is true for other mammals. When dogs or wolves throw back their heads to howl or squirrels lower their jaw and pull in their cheeks to chitter, they are shaping sound with their vocal tracts.
None of the structures that we use to speak are unique to our species. Our chests are more amply supplied with nerves for the fine control of breath than those of most primate species, but this is an elaboration, not an innovation. Our chimpanzee relatives also drop their hyoid bones and larynges. The descent is lower, though, in humans, opening a more voluminous resonant space in our throats. This, combined with the protuberant faces of chimpanzees, means that the chimpanzee vocal tract is dominated by the mouth, with very little resonance in the throat. In humans, the resonant spaces in mouth and throat are of about equal size. Human and chimpanzee tongues are similar, although ours is more domed and larger relative to the size of our mouths. Anatomically, human speech is based on subtle changes in the proportions of structures present in other species. Contrast this with birdsong, which flows from a syrinx unique to modern birds. The evolution of both birdsong and human speech was a striking and novel expansion of the sonic diversity of the world. Theirs is the product of radical anatomical innovation, ours of tinkering.
Evolution used a heavier hand in our brains, creating new linkages that allow us to speak. These, too, build on talents and predispositions present in our close relatives. All great apes are keen learners. Infants take years to learn all they need in order to thrive within the social and ecological environment. This social transmission of behavior and tradition constitutes culture. But unlike in humans, the cultures of other great apes are founded almost entirely on close visual observation and tactile participation. Although other great apes are vocal, they do not, as far as we know, convey complex knowledge through sound. Our human ancestors connected vocal expression to culture. This union of two preexisting great ape abilities, vocalization and social learning, is the foundation of human language. We do not know exactly when this revolution took place. The hyoid bone was in modern form and position in ancestral humans, including Neanderthals, about five hundred thousand years ago. But there is nothing magical about the exact shape and position of this bone. Ancestors with higher hyoids and larynges might not have been quite as articulate as we are, but they had the anatomical capacity needed for complex sound making, just as other great apes do.
The conjunction of vocal production, learning, and culture left its mark in our brains and genes. Unlike in other primates, the nerves that control the larynx in humans thread directly into the “motor cortex,” the part of our brain that controls voluntary movements. These connections give us finer control and, most important, bring vocal production into the realm of learning. We also have substantial and complex brain connections among the laryngeal nerves and those involved in vocal interpretation, sonic memory, and the control of body movements involved in speech such as those in the tongue and face. The richness of these links seems at least partly controlled by a gene, FOXP2, whose sequence in humans diverges greatly from that in other primates. The gene acts as a regulatory hub, stimulating and suppressing the actions of other genes that guide the growth and interconnections of nerve cells that coordinate muscular action, sensory input, memory, and interpretation. Like the hyoid, the human form of the FOXP2 gene dates to at least five hundred thousand years ago and was shared with our relatives in the Homo genus, Neanderthals and Denisovans. Neanderthal ears were similar to those of modern humans. Reconstructions suggest that the middle and inner ears were, like ours, tuned to the frequencies of human speech. It is likely, then, that these now-extinct cousins could speak.
Brain networks, greatly elaborated in humans compared with other primates, allow humans to draw together vocal production, interpretation, and memory in ways that other species cannot. When we speak, we evince our human ability to comprehend: com “together,” prehendere “to grasp hold of.” Human speech is an achievement not only of tinkering, but of unification and interconnection. We are not alone in this talent. Many birds, and perhaps other vocal learners such as whales and bats, also have direct connections from the vocal organ to the motor portions of the brain, along with elaborated connections among regions of the brain concerned with memory, perception, analysis, and production of sound.
In reading these words, you take this human talent for integration one step further. Black-and-white glyphs are crystallizations of what was, until the invention of written language, ephemeral. Breath turned to ink. Vibrations in air frozen onto the page. Three hundred milliseconds after you gaze on a word, electrical energy courses through the visual cortex of the brain. Four hundred milliseconds after that, the auditory cortex fires, swiftly followed by brain regions that interpret sound and language. Within less than a second of attention to the written word, silent reading provokes a frenzy of activity in the “listening” portion of the brain. Silent reading thus opens us to apparitions, the ghosts of writers’ voices. Movements of fingers on keyboards and pens cast these sonic wraiths out of the body and onto the page.
As your eye moves over these clusters of letters, sound no longer travels through air but in waves of electrical activation along wet, fatty cell membranes in a mammalian brain. Now speak these words aloud. The wave leaps from flesh to air. Just as it always has, sound moves from one being to another, from one medium to another, connecting and transforming.
PART III
Evolution’s Creative Powers
Air, Water, Wood
Listen! In the animal sounds around us, we hear the diverse physicality of the world. The songs of birds contain within them the acoustic qualities of vegetation and the voices of the wind. Mammal calls reveal how predators and prey hear one another in the varied terrains of forests and plains. Water’s many moods are expressed in the forms of whale and fish songs. The inner structure of plant material is manifest in the vibratory signals of insects. Even the words on this page, voiced silently as you read, have living within them signatures of the air and vegetation in which human language blossomed.
I stand in a pine and spruce forest on the eastern slope of the Rocky Mountains of Colorado, on the upper reaches of North Boulder Creek where it descends from the continental divide. It is spring, but at this high elevation snow still covers the ground. All is quiet, except for the rich voice of a red crossbill. The bird’s song is a slender watercolor brush flitting across paper. Strokes of warm color dash and sweep on a smooth, open surface. Each note rings with extraordinary clarity in the snowy hush and still air.
I rummage in my waist pack for a sound recorder and microphone, the zipper and fabric sounding obnoxiously loud, then I hold still, pointing the microphone toward the tip of the ponderosa pine tree where the bird perches. For a few minutes, I rest in the presence of the song.

