The new psychoacoustic paradigm recognizes that human hearing didn’t evolve to hear tones and beeps. Rather, it is exquisitely tuned to detect the sounds of nature, which aren’t composed of frequencies and tones, but of transients of indeterminate and randomly varying frequency. In fact, many of the sounds of the natural world, an understanding of which confers important survival benefits, have no frequencies. Examples include crackling leaves, snapping twigs, the sounds of wind, rain, and running water. Our hearing system is highly adapted to operate against this background of natural, causal sounds. Not surprisingly, therefore, the latest neuroscientific research reveals that our hearing mechanism is considerably more dependent on, and sensitive to, timing cues than frequencies. Moreover, this research has produced startling revelations that would have been far beyond the ken of the early psychoacoustic researchers. For example, there are more neural pathways descending from the brain to the ear than ascending from the ear to the brain. Why would our hearing system benefit from this two-way communication? Modern neuroscience and modeling reveal that the brain is constantly sending signals to the ear, modifying its response in real time, as we are perceiving the sound. As we listen, signals from the brain physically “tune” the ear to better encode the specific information it needs to determine more accurately what is creating the sound and where it is coming from. These signals descending from the brain adjust both the cochlea and the ascending neural pathway, fine-tuning the auditory system’s so-called “grouping” and “feature extraction” abilities. The ear and brainstem response is constantly changing microsecond by microsecond. (This phenomenon, incidentally, is one reason why lossy codecs such as MP3 fail in practice despite working in theory; the masking model on which they are based views the ear as a passive device. It’s not nice to fool Mother Nature.)
The implications of this discovery cannot be overstated. The fact that neurons change their coding in real time to combine and extract features in the sound tells us that the system is highly non-linear. It also suggests that simplistic theories based on Fourier analysis, and on the closely related sinc (cardinal sine function) sampling kernel on which digital audio’s low-pass “brickwall” filtering is based, must be viewed with caution.
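For readers who want to see the sinc kernel written out, the standard textbook form of Nyquist-Shannon reconstruction is given here. The symbols are assumptions introduced for illustration and do not come from Stuart and Craven: x[n] are the samples, T = 1/f_s is the sampling interval, and x(t) is the reconstructed waveform.

```latex
x(t) \;=\; \sum_{n=-\infty}^{\infty} x[n]\,
      \operatorname{sinc}\!\left(\frac{t - nT}{T}\right),
\qquad
\operatorname{sinc}(u) \;=\; \frac{\sin(\pi u)}{\pi u}
```

Note that the kernel is nonzero both before and after each sample instant and decays only as 1/|u|, so any practical brickwall filter is a truncated or windowed approximation of it.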
Digital audio systems introduce specific errors, correlated with the signal, that smear sonic events in ways that never occur in nature, confusing the brain and reducing its ability to recognize and identify those sonic events. In the last decade, researchers have independently confirmed that we are much more sensitive to timing information and temporal microstructure than predicted by our 20kHz frequency limit. (See, for example, “Human Time-Frequency Acuity Beats the Fourier Uncertainty Principle” by J.N. Oppenheim and M.O. Magnasco, Physical Review Letters, 2013.) It’s this timing information that helps the brain perform the apparent miracle of converting neural impulses into the impression of hearing individual objects in space. Degrade that information, as digital audio encoding and decoding does, and you reduce the clarity with which we perceive objects in natural space. The recent psychoacoustic research also reveals that one group of neural pathways from the brain to the ear is dedicated to transmitting only reverberation information, which is a critical part of the natural world (and also of musical realism). So although we can’t hear test tones above 20kHz, we can detect the benefit of temporal microstructure within midrange frequencies right down to the microsecond level. The list of radical new discoveries goes on and on, revealing that our hearing mechanism is far more complex and sophisticated than previously believed. Yet the researchers at the cutting edge acknowledge that we’re still in the infancy of understanding how the neural pathways operate.
Stuart and Craven have studied this academic research and applied it to understanding the different ways in which distortions of temporal microstructure affect our ability to identify, segregate, and locate “external” objects—to experience a well-defined soundstage, in other words. To quote Stuart, “The more we stop interfering with the microstructure, with the stop and start of sonic elements, the easier it is for our brainstem to stream the necessary components for the perception of the viola, of the violin, of the piano, and the easier it is to ‘grasp’ the sound of the venue before the first note is played.” Indeed, in my comparisons of the same music in conventional digital and MQA (made from the same master), I can often instantly identify the MQA version as soon as I hear the room sound by its more realistic sense of space.
So, here we are in 2017, with our digital-audio systems designed around first-generation paradigms of information theory (Nyquist-Shannon) and psychoacoustics (frequency-based, the ear as a linear and static device). MQA comes along and forges a new path, building on the advances in other fields and developing from first principles an entirely new way of looking at the question of how best to encode, distribute, and decode digitally represented music. By focusing on the entire analog-to-analog chain, MQA has produced a system that delivers sound quality better than that of the original high-bit-rate master recording (through correcting technical errors in the original A/D converter); is backward-compatible with all distribution platforms and playback hardware; offers assurance (via the MQA light on every DAC) that the bitstream being decoded by your DAC is identical to the bitstream created in the studio; and creates a file that is small enough to be streamed to everyone. It’s quite astounding that MQA can combine so many virtues, and solve so many problems, in a single stroke. It’s an audiophile’s dream come true.
Every scientific revolution begins when discoveries are made that aren’t explained by the existing paradigm. To cite one example of this in digital audio, high sample rates sound better than lower sample rates, even though the upper limit of human hearing is regarded as 20kHz. According to Nyquist-Shannon, the CD’s 44.1kHz sample rate can perfectly reconstruct the audio waveform all the way up to 20kHz. And according to first-paradigm psychoacoustics, information above 20kHz is irrelevant, and our temporal resolution is limited to that implied by that 20kHz upper-frequency limit. If this is the case, why would higher sampling rates sound “better”? The answer is that the digital filters required by Nyquist-Shannon sampling at 44.1kHz introduce time distortion, or “temporal blur.” The filters for higher sampling rates are gentler and thus introduce less temporal blur. (Specifically, CD introduces around 5ms of temporal blur; 192kHz/24-bit PCM creates 300µs of blur; MQA aims for end-to-end analog blur as low as 10µs, targeting a system response similar to that of sound traveling a short distance in air.)
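The link between filter steepness and time spread can be illustrated with a rough numerical sketch. Everything in it is an assumption made for illustration: it uses a generic Hann-windowed sinc low-pass filter (not the filter in any actual converter, and not MQA's), a textbook rule of thumb for how many taps a given transition band needs, and an ad hoc measure of duration (the span over which the impulse response stays within 60dB of its peak). That measure is not the definition of "temporal blur" behind the figures quoted above, so the printed numbers should not be expected to match them.

```python
import math


def taps_for_transition(sample_rate_hz, transition_hz):
    """Odd tap count from an assumed rule of thumb for a Hann window:
    transition width is roughly 3.1 * fs / N."""
    n = math.ceil(3.1 * sample_rate_hz / transition_hz)
    return n if n % 2 == 1 else n + 1


def hann_sinc_lowpass(sample_rate_hz, pass_hz, stop_hz):
    """Hann-windowed sinc low-pass FIR, normalized to unity gain at DC."""
    num_taps = taps_for_transition(sample_rate_hz, stop_hz - pass_hz)
    # Place the cutoff midway through the transition band (cycles/sample).
    fc = 0.5 * (pass_hz + stop_hz) / sample_rate_hz
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]


def span_seconds(taps, sample_rate_hz, rel_threshold=1e-3):
    """Time between the first and last tap within 60dB of the peak
    (an ad hoc duration measure chosen for this sketch)."""
    peak = max(abs(t) for t in taps)
    above = [i for i, t in enumerate(taps) if abs(t) >= rel_threshold * peak]
    return (above[-1] - above[0]) / sample_rate_hz


# Pass 20kHz in both cases; stop at half the sample rate.
for fs, stop in ((44_100, 22_050), (192_000, 96_000)):
    h = hann_sinc_lowpass(fs, 20_000, stop)
    print(f"{fs} Hz: {len(h)} taps, span ~{span_seconds(h, fs) * 1e6:.0f} us")
```

The 44.1kHz filter has only about 2kHz in which to go from passband to stopband, so it needs far more taps, and its impulse response rings over a much longer time than the 192kHz filter, which has 76kHz of room.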
This anomaly, and many others that didn’t fit the existing paradigm of PCM digital-audio theory and psychoacoustics, have led us to Kuhn’s “crisis” phase of the revolution. The existing paradigms are showing their weaknesses, and new paradigms are emerging under which these are no longer anomalies, but are fully consistent with, and explained by, the new framework. MQA is thus in the crosshairs of Kuhn’s “battle” between those who cling to the old paradigm and those who embrace the new. Early PCM audio (and DSD) will one day be regarded as primitive relics of the past, the product of first-paradigm thinking in audio engineering, information theory, and psychoacoustics. But as Kuhn demonstrates with example after example, it will be a long time before this revolution is fully complete.
Viewed in the context of Kuhn, it’s not surprising that MQA has its critics. MQA fits the definition of a paradigm shift; the ideas on which it is based are not advances within an existing framework of knowledge, but represent an entirely new framework. Some people can’t comprehend that new framework, and are reluctant to abandon long-held beliefs in certain “proven scientific facts.” But I suppose we should cut the critics some slack. After all, if Lord Kelvin could have been so wrong about the state of physics in 1900, it is easy to understand how a few audiophiles could be so mistaken about MQA.