Most people dislike the sound of their voice when they hear it recorded. Even professional speakers, actors, or singers often flinch at playback. They say things like, "I sound weird," "That doesn’t feel like me," or even, "I can’t stand it." It’s a nearly universal experience, and it's not limited to amateurs.
It's worth briefly pointing out that some of this discomfort may come from the medium itself. If you dislike how your voice sounds on Zoom, Teams, or similar platforms, it's often because these systems apply heavy noise suppression, compression, and filtering. These algorithms alter your timbre and strip your voice of its finer details, removing and distorting the very overtones that give it depth and identity.
But the discomfort runs deeper than just bad audio processing. Even with perfect microphones and no filtering, many people still react strongly to the sound of their recorded voice. Why? Because the core issue isn’t just technical—it’s perceptual. It has to do with how our nervous system experiences effort, movement, and sound from the inside, and how that internal experience rarely matches what the outside world reflects back to us.
The most common explanation for this reaction is physical: when we speak, we hear ourselves both through air conduction (sound waves traveling through the air to our ears) and bone conduction (vibrations traveling through our skull and body). Bone conduction adds depth, resonance, and warmth. So when we hear a recording, we get only the air-conducted part, which sounds thinner, more nasal, less "full."
A secondary explanation is emotional: our recorded voice confronts us with an external version of ourselves that we can’t control. We judge it harshly. It reveals our insecurities. On this view, the discomfort is a mix of acoustics and self-consciousness.
Why That Isn’t Enough
These explanations are partially true, but they fall short.
The bone conduction theory fails to explain why dancers, martial artists, and actors often feel deeply unsettled when watching silent recordings of themselves. They may experience a sharp disconnection ("That doesn’t look like me") even though no sound, and therefore no bone conduction, is involved at all. The discomfort is the same: a mismatch between internal sensation and external reality. This tells us that the problem isn’t specific to how we hear, but to how we perceive ourselves across all forms of expression. The source of discomfort, then, can’t be purely acoustic. Nor does it fade predictably over time: many people continue to dislike their recorded voice even after frequent exposure.
This suggests the issue is not just about hearing a different sound, but about a deeper perceptual disconnect.
A Sensorimotor Mismatch
The core of the issue is perceptual, but it’s important to say this clearly: this is not a flaw. What we’re describing here is the normal state for nearly everyone. It’s not a dysfunction, but a feature of how the nervous system prioritizes stability and coherence over precision. Modern life pushes the nervous system into a kind of low-grade stress mode, and we become attuned to 'thicker,' less refined sensory information (pressure, tension, force) while more delicate vibratory cues are suppressed. The nervous system is doing its job: keeping things manageable, prioritizing what feels urgent, even if that means muting the subtleties.
When we speak or move, we don’t just produce an output—we experience a whole set of internal signals. These include interoception (internal bodily sensations), proprioception (sense of movement and position), and effort (the muscular and emotional work we associate with the action). This creates a rich internal map of what we feel we are doing.
But when we hear or see ourselves from the outside, we encounter something very different: the actual trace of what we did in space and time. The internal map, by contrast, is constantly being smoothed, gated, and simplified by the nervous system, which often suppresses subtle vibratory feedback and highlights muscular effort. This gating isn’t a problem to fix; it’s an adaptive response that helps the nervous system prioritize survival-relevant signals. When in stress mode, the system favors blunt, urgent cues like pressure or tension and filters out more refined vibratory feedback.
This doesn’t just change what we feel; it also changes how we move and sound. Our gestures become less refined, our voices less vibrant and rich. We default to effort and density, because that’s what the nervous system is wired to amplify under chronic, low-intensity stress, the kind that shapes modern life more often than we realize. It’s not that we’re constantly facing life-or-death threats, but that our systems remain subtly on alert, prioritizing force over nuance, pressure over resonance. So when the internal sense meets the external record, the result is often dissonance. What you felt and what you did don’t align.
This mismatch is especially clear in the case of the voice.
A voice lacking spectral richness, particularly in the upper harmonics, will not radiate effectively into space. These higher-frequency components carry energy and presence, but they are easily dampened or lost if not abundant in the source. They don’t travel far; they dissipate quickly. So while you may hear them faintly in your own head, the microphone, like the listener, may not pick them up. The role of higher frequencies in signaling proximity (and therefore intimacy) is often undervalued.
Meanwhile, low frequencies travel farther, but they are only part of what your voice really is. And crucially, bodily pressure, in the chest and elsewhere, dampens high-frequency tissue vibration rather than amplifying it. So while someone may feel strong chest involvement, the chest may in fact be acting as a vibration dampener, not a resonator.
This leads us to a major illusion: we often mistake what we feel for what we sound like.
When a speaker feels vocal power or presence, they’re often registering pressure in their body, not actual vibration in their voice. They feel muscular drive. But pressure doesn’t radiate. It doesn’t project into space. What projects is vibration, and vibration is often gated out of awareness by the stressed nervous system.
The key shift occurs when people begin to differentiate between the sensation of pressure and the sensation of vibration. Most people are attuned only to pressure; they associate effort with presence. They can almost hear their effort. But when the recorded voice lacks the expected richness, it’s because what was felt was effort—not resonance.
The estrangement comes from this very widespread sensory confusion. The voice feels full during speaking, but sounds flatter in playback. The reality is not that the recording is a distortion—but that the internal sense was biased toward effort over emission.
To understand why this sensory mismatch is so unsettling, we can turn to the Jungian concept of the shadow archetype. In Jungian psychology, the shadow represents the parts of ourselves we do not normally perceive or identify with—the aspects that are rejected, suppressed, or simply remain outside conscious awareness. These aren't necessarily negative traits—they're simply unknown or unfelt.
Applied to the voice, the shadow takes on a physiological dimension. This isn’t necessarily about trauma or repressed emotions; it’s about how the nervous system filters out parts of our embodied reality in order to maintain a functional self-image. The tensions, pressure, and uncomfortable bodily sensations we have grown used to are gated out. Over time, they become invisible to us. They remain outside conscious awareness, yet they keep making a sizable contribution to the way our system functions.
When we hear our recorded voice, we’re confronted with the aspects of ourselves that were filtered out: lower-than-necessary nervous system tone, stressed breathing, vocal constriction, spectral flatness.
The discomfort is not just psychological. It is sensorimotor estrangement—a confrontation with the parts of our vocal identity that never make it into awareness. Neuroscience-based sensory voice work, then, becomes a practice of shadow integration: not in metaphor, but in flesh.
By learning to sense what we normally overlook, to distinguish effort from resonance, and to inhabit more of our true vibratory field, we begin to reclaim the voice not just as sound—but as self.
Want to develop a voice you will like to hear recorded? I can help. Reach out, and let’s get started.
© Andrea Caniato, June 2025