Measurements, Listening, and What Matters in Audio

An important part of every magazine editor’s job is to choose the writers who appear in the magazine, and to decide which articles to publish and which to reject. The collective result of these judgment calls over many years is an implicit statement of the magazine’s fundamental view of the subject it covers—the magazine’s heart and soul, if you will. For example, Consumer Reports evaluates automobiles from the viewpoint that cars are nothing more than transportation appliances, while Automobile reflects a passion for fine automobiles and the joy of driving. The respective editors of each magazine would surely decline to publish a review written by a writer on the other magazine’s staff; each writer’s approach is diametrically opposed to its counterpart’s editorial philosophy. The Automobile reader doesn’t care about the number of cupholders in a Buick minivan, and the Consumer Reports subscriber isn’t interested in reading a poetic description of rowing a Porsche’s manual six-speed gear box while driving over Italy’s Stelvio Pass.

My approach to editing The Absolute Sound is to publish a wide range of opinions and viewpoints. There’s no single truth that will take every listener to his or her own musical nirvana. Tubes, transistors, analog, digital, horns, planar speakers—the list goes on and on. Consequently, I don’t impose a Soviet-style orthodoxy to which writers must adhere. That would exclude legitimate viewpoints and make for a boring magazine. My only criterion is whether the writer’s piece is intellectually rigorous—up to a point. For example, I wouldn’t publish an article asserting that MP3 streaming from a phone via Bluetooth to wireless earbuds adequately serves the music, the artist, or the listener. Similarly, I wouldn’t publish a writer who claimed that all amplifiers sound the same.

Which brings me to an essay longtime TAS contributor Robert E. Greene sent to me for publication. I have enormous respect for Dr. Greene’s intellect (he’s an internationally recognized professor of mathematics and author of graduate-level math textbooks, the titles of which I don’t begin to understand); his knowledge of loudspeaker design and testing is encyclopedic; and his understanding of music (he plays violin in an amateur orchestra) and the theoretical underpinnings of audio technology is about as deep and comprehensive as that of anyone I know.

And yet, I find that some of the views expressed in his essay are diametrically opposed not just to my perspective, but to the fundamental tenets on which The Absolute Sound was founded. Those fundamental tenets are the primacy of the listening experience in judging reproduced sound, and the idea that any improvement in sound quality can be musically significant. Although we agree on the importance of the microphones, acoustics of the recording venue, and acoustics of the playback room (which is why I just built a dedicated listening room), Robert’s essay argues that measurements can fully explain all audio phenomena, and that because loudspeakers in rooms exhibit deviations from flat frequency response, any improvements in sources, amplification, cables, AC power, etc. aren’t worth pursuing. Although I welcome a diversity of opinion in these pages, there must be a line between such diversity and views that are anathema to the magazine’s core principles. Robert’s essay caused me to consider just where to draw that line.

I concluded that the right thing to do was to publish Robert’s essay, but to also offer a countervailing viewpoint. I therefore present Robert’s essay followed by an editorial I wrote that appeared in Issue 218, updated and expanded here. Robert Harley


The View from the Edge

Robert E. Greene

In the early part of November just past, I fell seriously ill. I have recovered completely, but I had an extended convalescence, being one month in the hospital and another month in rehabilitation facilities. Other people’s medical problems make tedious reading, so I won’t go into details, but the enforced inactivity gave me a long time with not much to do except think—about audio in particular. And it occurred to me that it might be of interest to others—what I came up with contemplating audio at extended leisure following return from what I can only describe, with minimal drama, as a brush with the possible end of life. One tends to think seriously after that and to reexamine the foundations of one’s beliefs.

Kenneth Clark, at the end of his television series Civilisation, ventured to summarize his personal view of all that had gone before in thirteen episodes recounting the development of western culture. He began his summary by saying, “At this point, I reveal myself in my true colors as a stick in the mud.”

This came to mind because in that same spirit, after decades in the audio industry—I started with TAS in the early 1980s—I have settled into believing a number of formerly quite conventional things, albeit with certain twists. Most fundamentally, I think that almost everything in audio can be explained by measurements, provided one does the correct measurements sufficiently carefully. In particular, I think a very great deal can be explained simply by frequency response and the closely related matter of phase response. (These are indeed closely related: In minimum phase devices, one determines the other.)

People sometimes fail to realize how much can be explained on this basis because they do not always recall—though TAS has told them often—how tiny the threshold is for audibility of response differences: 0.1dB changes can be audibly detected. Arithmetic shows that there are vast numbers of audibly distinguishable possibilities that would seem superficially quite close to “flat.”

Speakers may not need to be flat within ±0.1dB to sound “musical,” but they surely need to match with that precision to sound alike. There is a lot of room for variation in this, given that speakers are typically lucky to be ±1dB (no decimal point). 
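The arithmetic behind this can be sketched roughly as follows. This is an illustration, not a psychoacoustic model: it assumes the response is specified in 30 independent bands (a hypothetical third-octave-style division), that each band may sit anywhere within ±1dB, and that a 0.1dB step in any single band is distinguishable. The function name and its parameters are invented for this example.

```python
def count_response_curves(tolerance_db: float, step_db: float, bands: int) -> int:
    """Count response curves that stay within +/- tolerance_db when each
    band can take any level on a grid of step_db (assumed audibility step)."""
    # Number of grid positions from -tolerance to +tolerance, inclusive.
    levels_per_band = round(2 * tolerance_db / step_db) + 1
    # Bands are assumed independent, so the choices multiply.
    return levels_per_band ** bands


curves = count_response_curves(tolerance_db=1.0, step_db=0.1, bands=30)
print(f"{curves:.2e} distinct curves within +/-1 dB")
```

Under those assumptions the count is on the order of 10^39, even though every one of those curves would be described as flat within ±1dB.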

Attached to this rather abstract business is a practical matter of my own experience: I have found that almost every speaker can be improved audibly by some judicious EQ. People are reluctant, it seems, to take this up, but I have found it to be true.

A second point is that the room/speaker interaction is really critical. No matter how good a speaker is anechoically, if it has a 5dB dip between 100 and 200Hz from floor interaction it is going to sound wrong. (Long experience with DSP room correction devices has shown me how often such a hole develops.) Moreover, control of room reflections in general is a crucial matter. After a visit some years ago to an RFZ studio room [a room in which the listening seat is positioned in a reflection-free zone—RH] designed by Ole Lund Christensen and Poul Ladegaard in Denmark I formulated in my own mind the slogan “acoustics is everything.” And to a surprising extent, this has been the case in my continuing experience.
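A rough sketch of where such a floor-interaction dip can come from, using simple image-source geometry. The heights and listening distance below are hypothetical, and the model assumes a single point source, a perfectly reflecting floor, and a speed of sound of 343m/s; a real floor and a real driver will give a shallower and broader dip.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def floor_bounce_null_hz(source_height_m: float, ear_height_m: float,
                         distance_m: float) -> float:
    """First cancellation frequency between the direct sound and a single
    floor reflection (image-source geometry, ideal reflecting floor)."""
    direct = math.hypot(distance_m, source_height_m - ear_height_m)
    # The reflection behaves like a mirror-image source below the floor.
    reflected = math.hypot(distance_m, source_height_m + ear_height_m)
    path_difference = reflected - direct
    # The two arrivals cancel when the extra path is half a wavelength.
    return SPEED_OF_SOUND_M_S / (2.0 * path_difference)


# Hypothetical setup: driver 1.0 m high, ears 1.2 m high, 2.0 m apart.
null_hz = floor_bounce_null_hz(source_height_m=1.0, ear_height_m=1.2,
                               distance_m=2.0)
print(f"First floor-bounce null near {null_hz:.0f} Hz")
```

With these example dimensions the first null lands between 100 and 200Hz. Moving the listener farther away shortens the path difference and pushes the null upward in frequency.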

If you get these things under control—really flat speakers in a room with which they interact correctly—it is quite startling how “good” things will sound. In particular, it is possible to get the timbre of the reproduction remarkably close to what is actually on the recording. For this, it helps to listen sitting close to the speakers and with early reflections minimized. And, of course, the speakers need to be well behaved otherwise, e.g., good suppression of audible cabinet resonances and so on.

What about space? For decades, it has been fashionable to worry about “soundstage,” but this has reached the point that recordings are expected to have a soundstage almost independently of what the recording is; that is, the soundstage is expected to be a property of the playback system rather than a reproduction of what is recorded. To my mind, this is a matter of using reflections off the walls, especially the first reflections off the sidewalls, to generate an artificial sense of space. People may like this, but it is not really reproducing the recording. And the impression is very unstable in detail because it is not really there on the recording. (Some recordings actually have outside-the-speaker images because of phase effects from spaced microphones, but most recordings do not have this in any systematic way.) Because of the instability, the idea has arisen that all kinds of things that really have no reason to be part of the reproduction of space at all can be evaluated according to their effect on soundstage, with enlargement being always regarded as better. In my view, this is not a good way to analyze audio playback.

My view is that the correctness of stereo is essentially completely embodied in the tight focus of images from mono signals. If this focus is complete, stereo is working. (Incidentally if this sounds crazy to you, I am not alone in this view. John Atkinson says the same thing explicitly in a recent interview with Steve Guttenberg.)

This idea of evaluating everything in terms of soundstage is potentially a major source of confusion. Since no one has any idea of what kind of soundstage ought to arise from most recordings, soundstage is not really a sensible criterion for evaluation of anything. Ironically, Harry Pearson, who popularized the soundstage idea initially, was firmly of the opinion that one should not use the sound off walls, and that the spatial impression that was really on the recording would be ideally correct listening out of doors. But this fundamental principle seems to have been lost.

Attached to the unstable soundstage matter is a general obsession with micro-effects, some of which may not even be detectable under blind conditions. Some of these tiny effects may be audible, but the important point is that there is seldom any mechanism for deciding if the changes are to the good or not. If there is no way to know why some change, of a power cord say, affected the sound, there is no way to decide whether the effect, if any, was positive or not. How could you tell? Believe the manufacturer? Believe reviewers, who have as little basis as you yourself? This is a major issue. Inferring from listening to recordings what is correct among possibilities that differ by very small amounts is a process fraught with peril.

My overwhelming experience personally is that if you get fundamentals right, all the tiny things will fade into insignificance. Tiny changes may remain audible, but they will not affect musical experience all that much. Back in Copenhagen, in Christensen’s and Ladegaard’s reflection-free-zone room, all the electronics sounded good. Various electronic devices did not become identical, but they all sounded good in any verifiable sense of the word. Electronics work; speakers in rooms usually do not work so well. But when the room and speaker thing works well, the electronic things fade in significance. When the big things are right, the small things count for little. It is as if, in practice, we worry about small things because we have not been able to get all the big things right.

It is interesting to ask oneself how audio came to direct so much attention at things that make so little difference. The answer is quite easy: it is a basic principle of human perception that changes attract attention. If you start with an audio system that makes basic errors, you will to some extent get used to the errors. This is how people can listen to systems which are demonstrably inaccurate in basic ways. But then if you make a change, even a really small change, in the system, which is likely still making large and basic errors, the change will seem larger in significance than it really is, simply because it is a change. This is how people can end up worrying about tiny things in a system which is making big errors. Especially if they do not have a specific standard of comparison in place, it is very easy to form an exaggerated idea of the importance of changes, even really small changes. The failure to observe this fact about human perception has led us to where all too many of us are now.

Concentrate on fundamentals; that would be my suggestion. And finally, never ever forget that the recording dominates. Remember forever what Peter McGrath said in the pages of TAS a few years ago, about how a cassette recording with a good microphone setup sounded far better than an ultra-high-resolution recording of the same event with a less fortunate mike setup. Understanding that the big things count most seems to me the beginning of audio wisdom. What makes audio bad is first of all acoustically bad recordings, not the medium but the microphone pickup—there are few really good ones—and second acoustically bad playback, speakers, and rooms. The rest has mostly turned out to be not worth worrying about by comparison in my experience. Acoustics—in the sense of microphones, speakers, and their room interaction—really is almost everything!


The Smaller Difference

Robert Harley

I’ve long been fascinated by the idea that if the sound is different, then signals are different. That is, if you hear a difference between, say, two aftermarket power cords, it follows that the electrical signal driving your loudspeakers must be different, which causes the loudspeaker cones to move slightly differently, creating a change in the patterns of vibrating air molecules striking your eardrums and thus in the electrical signals flowing through the auditory cortex. This change is interpreted by our brains as greater or lesser musical realism.

The concept is axiomatic, of course. But in the real world some of the differences in the musical waveforms traveling down loudspeaker cables, or the acoustic compressions and rarefactions reaching our eardrums, must not just be vanishingly small, but minuscule beyond our ken. These differences in the shapes of the musical waveforms are far too small to see or measure with even the most sophisticated technology, yet we as listeners not only routinely discriminate such differences, we sometimes find musical meaning in these differences.

This phenomenon is partly explained by the lack of a linear relationship between the objective magnitude of a distortion and the musical perception that distortion engenders. You might replace a cable and suddenly realize that, in a familiar recording, what you thought had been a guitar toward the back of the soundstage was actually two guitars. The difference in the electrical and acoustical signals produced by the different cables is infinitesimal, but the musical difference—one guitarist or two—is profound.

Concomitantly, you could introduce 2% second-harmonic distortion (a huge, easily measurable objective change) into an audio signal and perhaps not notice it, and if you did, the distortion would not be unpleasant, producing a warmer, plumper sound. Yet reconstruct an analog waveform from digital samples with a clock whose timing precision varies by just a hundred picoseconds (0.0000000001 seconds, or one-tenth of a billionth of a second, the time it takes light to travel about an inch) and we hear the change in the analog waveform’s shape as a reduction in spaciousness, a hardening of timbre, a “glassy” character on high-frequency transients, a softening of the bass, and an overall reduction in listener involvement. Some of the distortions produced by an audio recording/reproduction chain don’t occur in nature and thus strike a discordant note when processed by our brains. Sounds produced by nature and by musical instruments virtually always have a significant second-harmonic component, but we never encounter in nature a waveform with the specific distortion introduced by digital jitter.
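To put rough numbers on how different these two objective magnitudes are, here is a back-of-the-envelope sketch. It is a first-order estimate under stated assumptions, not a model of audibility: it treats the jitter as a single worst-case timing error applied to a full-scale 20kHz sine (the frequency is chosen here as a worst case), and the function name is invented for this example.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
JITTER_S = 100e-12  # the hundred picoseconds mentioned above


def jitter_error_db(signal_hz: float, jitter_s: float) -> float:
    """Worst-case sampling error of a full-scale sine, relative to full
    scale, from a timing error of jitter_s (first-order slope estimate)."""
    # A sine A*sin(2*pi*f*t) has maximum slope 2*pi*f*A, so a timing
    # error dt shifts the sample value by at most about 2*pi*f*dt*A.
    return 20.0 * math.log10(2.0 * math.pi * signal_hz * jitter_s)


light_inches = SPEED_OF_LIGHT_M_S * JITTER_S / 0.0254  # metres to inches
harmonic_db = 20.0 * math.log10(0.02)                  # 2% second harmonic
jitter_db = jitter_error_db(20_000.0, JITTER_S)        # assumed 20 kHz tone

print(f"Light travels {light_inches:.2f} inches in 100 ps")
print(f"2% second harmonic: {harmonic_db:.1f} dB relative to the signal")
print(f"100 ps timing error at 20 kHz: {jitter_db:.1f} dB relative to full scale")
```

By this estimate the 2% harmonic sits near -34dB while the jitter-induced error is near -98dB, a gap of roughly 64dB. The sketch says nothing about which one a listener will notice; it only quantifies how much smaller the second error is by the usual objective yardstick.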

The great audio thinker Richard Heyser illustrated this idea in a treatise for an Audio Engineering Society workshop discussing the nature of distortion: “Now let’s consider this: the end product of audio is the listening experience. The end product is the result of perception, cognition, and valuation processes occurring in the mind. What things do we know about such processes? The answer is very little. But there are a few observable facts about this which, when considered for audio, give pause for redefining the concept of distortion. We know, for example, that words which are sung are perceived in a different manner than words which are spoken. Aphasia—the loss of ability to understand or speak words as a result of brain lesion—does not affect music. Where, in our audio technology, can we measure a waveform and distinguish its message as that of language or music? The brain does it. And can we be so presumptuous as to assume that the same measure of distortion which we use for one such waveform (which we cannot identify) must also apply to the other waveform? The left hemisphere of our brains and the right hemisphere play an incredibly complicated role in perception—a role completely ignored whenever we make a simple waveform analysis with audio test equipment.”

Humans seem to be hardwired to discriminate very small differences between similar things. Think of the widespread connoisseurship in any number of fields: wine, dog and cat shows, types of carnauba car wax, coffee, cheese; the list is endless. Moreover, we don’t care about differences between coffee and tea, or between dogs and cats. We’re somewhat more interested in the differences between breeds of dogs, but some of us are absolutely obsessed with tiny variations within a specific breed. Meridian Audio founder Bob Stuart summed up this phenomenon with the phrase “the increasing importance of the smaller difference.”

Music is different from other forms of communication in that the meaning and expression are embodied in the physical sound itself. The vibrating air molecules striking our eardrums are not a representation of the music, but the music itself. Contrast music listening with reading type on a page (or pixels on a screen), in which the letters are merely symbols that stand in for the underlying meaning. Distort the type, or read in low light, and the meaning remains unchanged. But change the shape of a musical waveform and the composer’s or performer’s expression is diluted. You might not hear a subtle dynamic inflection, miss a crucial rhythmic interplay, or be oblivious to the way tone colors combine that would otherwise create an ineffable flood of emotion. The sound contains the meaning; it is not a representation of the meaning that can be divorced from the physical phenomenon conveying it.

All these observations point to the fallacy that technical measurement can replace the discrimination ability and auditory-processing power of our ear/brain system. Even if we could see the tiniest distortions in a musical waveform, this analysis would still remove from the process not just our hearing system, but more importantly our interpretation of how that distortion affects the communication of musical expression. Because music speaks to our humanity, a piece of test equipment, no matter how sophisticated, can never replace the experience of sitting down between a pair of loudspeakers.

