Technicals of MQA
A gentle reminder that presuming a partial map matches the territory is
a recipe for insidious disaster, and designing any system, audio or
otherwise, to that partial map is likewise a recipe for insidious
disaster.
We may be made of stardust, but we exist in time and space. Frequency is
a convenient tool at times, but it is an abstract space, a partial map
that only translates to the physical world in linear systems. Make any
element of a system non-linear, and the time/frequency presumption falls
apart. Our ears, being highly non-linear, can discriminate the timing of
events a full order of magnitude better in the real, physical world of
time and space than frequency analysis would predict.
https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.110.044301
That is the dirty little secret: Nyquist PCM digital audio was designed
primarily in the frequency domain, with time domain errors being swept
under the rug. We are into second and third generations of listeners
raised on such artifacts, and we either voice our music systems to
partially compensate for the inherent time smear, or adapt to such
smeared sonics as the new normal. MQA elegantly and completely solves
that.
One "thing" that low-bit, upsampling and oversampling schemes based on
frequency analysis miss is single event fidelity. They are all averaging
phenomena, with the inevitable tradeoff between steady state and
transient behaviour. MQA, designed with the time domain in mind, has a
unique way of coding and decoding single event transients, with low time
smear (or blur) and sufficient time resolution to be truly transparent:
aperture uncertainty low enough to be similar to a couple of meters of
air. No other system can do that.
~~~~~~~
Let's clear up one misconception.
Confusing data with information, and referring to MQA as a "lossy" or
"compressed" format, is naive at best, and often deliberately
disingenuous. The more appropriate question is: is any format audibly
transparent?
As much as lossy/lossless appears to be a concrete, left-brain term with
a clear definition, in the real world it is ironically a nebulous,
abstract term that evokes strong, irrational emotional reactions with
accompanying endless circular arguments, frequently over misunderstood
or deliberately twisted semantics.
With MQA, the PCM baseband is backwards compatible (which is
categorically bit for bit lossless, as backwards compatible within
existing PCM standards); in addition, the ultrasonic information is
coded and buried in the dither when folded to baseband. Unlike MP3, AAC
and DTS, the actual lossy formats, which greatly increase noise
modulation while discarding audio information to save bandwidth and file
size, MQA keeps noise modulation to a minimum and does not throw away
audio information. Repeat: MQA does NOT throw away any audio
information.
Information versus data: what is perceptually significant in the
ultrasonic spectrum is not a crude stair-step approximation with time
smear or blurring from Gibbs phenomenon ringing, which traditional
Nyquist PCM, including so-called hi-res, gives when close to the Nyquist
frequency. Rather, what is important, the perceptually significant
information, is the slope and timing of the small "squiggles" in that
ultrasonic spectrum that correlate with larger signals in the
(consciously) audible baseband spectrum.
With MQA, this ultrasonic information
is encoded, buried, "folded" (neither lossy nor compressed), in the
backwards compatible baseband PCM as sonically benign dither using
spread spectrum techniques to keep noise modulation to an absolute
minimum.
~~~~~~~
There is no "mystery"
about the process, nor about the neuroscience behind perceptually
significant phenomena that MQA is based upon. There are a number of AES
papers describing the research.
MQA could produce de-blurred DXD files, at a 352.8 or 384 kHz sampling
rate, and stop prior to the pre-folding stage; these would be both
backwards compatible with "hi-res" playback and recognized as MQA when
played back. Those would keep the bit-perfect crowd happy, at a cost of
roughly an order of magnitude larger file size. However, they are
audibly indistinguishable from, and measurably equivalent to, folded and
decoded MQA files at 44.1/48 kHz: less than 4 microseconds aperture
uncertainty, and no ringing or time smear, same as 2 meters of air. So,
stop complaining, guys.
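For a rough sense of the numbers in that paragraph, here is a minimal
Python sketch of the arithmetic. It assumes uncompressed stereo PCM at
24 bits for both rates (real files are usually losslessly packed, so
actual sizes will differ), and the function names are made up for this
illustration.

```python
def sample_period_us(rate_hz):
    """Time between two consecutive samples, in microseconds."""
    return 1e6 / rate_hz


def pcm_bytes_per_second(rate_hz, bits=24, channels=2):
    """Raw PCM data rate. Assumption: no packing or compression."""
    return rate_hz * bits * channels // 8


dxd_rate = 352_800   # DXD sampling rate from the text, in Hz
base_rate = 44_100   # folded baseband rate from the text, in Hz

dxd_period = sample_period_us(dxd_rate)    # about 2.83 microseconds
base_period = sample_period_us(base_rate)  # about 22.68 microseconds

# Size ratio of the unfolded file to the folded one at equal bit depth.
size_ratio = pcm_bytes_per_second(dxd_rate) / pcm_bytes_per_second(base_rate)

print(f"DXD sample period:      {dxd_period:.2f} us")
print(f"Baseband sample period: {base_period:.2f} us")
print(f"Raw size ratio:         {size_ratio:.1f}x")
```

Under these assumptions the raw size ratio is 8x, which is in the
neighbourhood of the "order of magnitude" quoted above, and only the DXD
sample period falls under the 4 microsecond figure.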
Again, MQA leaves the baseband Redbook/CD PCM data intact,
backwards compatible and sonically improved when played back on legacy
equipment; and efficiently encodes perceptually significant information
from the ultrasonic octaves: the timing and slope of low level signals
that while not consciously audible, are in practice viscerally sensed
as a more real, more transparent emotionally connected experience.
There is more to it than just throwing more bits at it; there is more than just frequency response.
Link to a series of AES papers from 2019
~~~~~~~
How in the world does one bury information in the dither, which looks like noise? Cool trick.
Pseudo random noise routines, straightforward in software, can be done
in hardware with a bunch of XOR gates, super easy with PECL/ECL XOR
gates. So a signal that looks to the outside analog world like random
noise actually has a pattern if one knows what to look for, the secret
handshake, which can then be unlocked and decoded. It is a highly
efficient mode of coding information. Spread spectrum modulation was at
one time primarily a military phenomenon, and is now widely used in
telecom, such as CDMA cell networks. Also, HDCD used a similar trick to
bury a control channel in the LSB dither. At the time of the HDCD patent
application, the examiner considered such a claim highly unlikely, some
kind of perpetual motion scheme, and requested an in-person demo, which
KOJ and Pflash provided. This IP is one of the reasons Microsoft bought
the company. I will refrain from confirming or denying what and how MQA
does a similar trick when doing the origami fold into the noise floor
with dither. However, that magic, and the incredulity that comes with
not understanding it, is likely one of the factors fanning the denial of
detractors. Some of them seem to be otherwise smart individuals who,
when "arguing", clearly lack an understanding of what they are denying.
Maybe Bob and crew could make a neat little demo box with blinking LEDs
to show that the process does indeed do what is claimed.
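To make the idea concrete without claiming anything about what MQA or
HDCD actually do, here is a toy sketch in Python. The 16-bit shift
register with XOR feedback is a textbook pseudo-random generator; the
LSB embedding, the 32 chips per payload bit and every function name are
assumptions made up for this illustration.

```python
import math

CHIPS_PER_BIT = 32  # hypothetical spreading factor, chosen for the demo


def lfsr_bits(count, seed=0xACE1):
    """16-bit Fibonacci LFSR: the feedback bit is an XOR of four state bits."""
    state = seed
    out = []
    for _ in range(count):
        out.append(state & 1)
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)
    return out


def embed(samples, payload_bits):
    """Overwrite each sample's LSB with (payload bit XOR pseudo-random chip)."""
    pn = lfsr_bits(len(payload_bits) * CHIPS_PER_BIT)
    out = list(samples)
    for i, chip in enumerate(pn):
        bit = payload_bits[i // CHIPS_PER_BIT]
        out[i] = (out[i] & ~1) | (bit ^ chip)
    return out


def extract(samples, n_bits):
    """Undo the XOR with the same sequence, then majority-vote each block."""
    pn = lfsr_bits(n_bits * CHIPS_PER_BIT)
    bits = []
    for b in range(n_bits):
        block = range(b * CHIPS_PER_BIT, (b + 1) * CHIPS_PER_BIT)
        votes = sum((samples[i] & 1) ^ pn[i] for i in block)
        bits.append(1 if 2 * votes > CHIPS_PER_BIT else 0)
    return bits


payload = [1, 0, 1, 1, 0, 0, 1, 0]
n_samples = len(payload) * CHIPS_PER_BIT
# A 1 kHz tone at 44.1 kHz stands in for the audio.
audio = [round(8000 * math.sin(2 * math.pi * 1000 * n / 44100))
         for n in range(n_samples)]

coded = embed(audio, payload)
recovered = extract(coded, len(payload))
print(recovered == payload)
```

Without the seed and the tap positions, the LSB stream of `coded` just
looks like noise; with them, the payload comes back out. Spreading each
bit over many chips also means a few damaged LSBs do not change the
majority vote.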
~~~~~~~
PCM up/oversampling (typically 44.1/48 kHz to 8x that) just means more
stairsteps defining a filter function of the original base sampling
frequency. This finer texture is supposedly better, and can lead to
incremental improvements. The unfolding that MQA does is entirely
different; MQA also includes intra-sample timing information to give a
time resolution in the single digit microsecond range, which traditional
up/oversampling does not.
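A small Python sketch of the conventional oversampling half of that
comparison. Linear interpolation stands in here for the much longer
interpolation filters real converters use, and the function name is
made up; the only point is that every new point is computed from the
base-rate samples.

```python
def oversample(samples, factor=8):
    """Insert factor-1 linearly interpolated points between neighbours."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out


base = [0.0, 1.0, 0.0, -1.0, 0.0]   # five samples at the base rate
dense = oversample(base)            # 33 samples at 8x the rate

# Taking every 8th point returns exactly the samples we started with.
print(dense[::8] == base)
```

The dense version has more and finer steps, but it is fully determined
by the five original values.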
If one ignores time smear, and just looks at the frequency domain, basic
digital theory stipulates that the encode and decode filters must match
for full resolution to be reproduced within the Nyquist band. In the
real world, they rarely if ever do. That is a yuuuuuge issue with
digital audio: there is no standard for conjugate matching filters,
except for MQA.
Conjugate encode/decode filters are a basic presumption in a
Nyquist/Shannon/Fourier digital sampling system: the goes-intos need to
match the goes-outas. The Gibbs phenomenon ringing on transients from
brickwall filtering will rarely if ever match between encode and decode,
as there is no standard! Early digital recorders had analog minimum
phase encode filters, with post event ringing only. Later generations
incorporated digital encode filters, known as linear phase, with both
pre and post event ringing. There is, again, no format standard for
those filters, other than a big fuzzy whatever. If the encode/decode
filters actually match, which again is the textbook ideal that rarely if
ever happens in the real world, then for full resolution to 16 bits or
better there needs to be a couple of seconds both before and after the
event for the pre and post event ringing to line up. However, that is
the inevitable time smear or blur of the old PCM Nyquist paradigm, with
its attendant tradeoff between time and frequency that has been swept
under the rug until now, with the advent of MQA.
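The pre versus post event distinction can be seen in a toy Python
comparison. The truncated-sinc FIR and the one-pole recursive filter
below are stand-ins chosen for brevity, not the filters of any actual
recorder or converter, and all names and parameter values are
assumptions.

```python
import math


def sinc_lowpass(half_len, cutoff):
    """Symmetric (linear phase) truncated-sinc FIR with 2*half_len+1 taps."""
    taps = []
    for k in range(-half_len, half_len + 1):
        if k == 0:
            taps.append(2 * cutoff)
        else:
            taps.append(math.sin(2 * math.pi * cutoff * k) / (math.pi * k))
    return taps


def fir(signal, taps):
    """Direct-form convolution."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out


def one_pole_lowpass(signal, alpha):
    """Minimum phase recursive low-pass: y[n] = alpha*x[n] + (1-alpha)*y[n-1]."""
    out, y = [], 0.0
    for x in signal:
        y = alpha * x + (1 - alpha) * y
        out.append(y)
    return out


def split_energy(response):
    """Energy before and after the largest-magnitude sample."""
    peak = max(range(len(response)), key=lambda i: abs(response[i]))
    pre = sum(v * v for v in response[:peak])
    post = sum(v * v for v in response[peak + 1:])
    return pre, post


impulse = [1.0] + [0.0] * 63   # a single click

lin_pre, lin_post = split_energy(fir(impulse, sinc_lowpass(16, 0.25)))
min_pre, min_post = split_energy(one_pole_lowpass(impulse, 0.3))

print("linear phase  pre/post:", lin_pre, lin_post)
print("minimum phase pre/post:", min_pre, min_post)
```

The symmetric filter puts equal energy on both sides of its peak, while
the recursive one has nothing before the peak and only a tail after it.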
It must be emphasized, again and again, ad infinitum it seems, how
important matching conjugate encode/decode filters are for
reconstructing the nuances in any digital audio or sampled system. DSD
does not have that, Redbook CD 44.1 kHz Nyquist PCM does not have that,
"hi-res" PCM does not have that; however, MQA does have matching
conjugate encode/decode filters. Ironically, truly lossy compression
algorithms such as MP3, AAC and DTS do have standardized reconstruction
filters, giving uniform mediocrity as their trade-off for reduced data
bandwidth and file size.
Low pass filters do not “just” roll off high frequencies; they smear
fast events out in time. Think of area under a curve. The high frequency
energy does not magically go away; it is in effect smeared, spread out:
a pop gets transmogrified into a blub. One can argue about group delay,
linear vs. minimum phase, apodizing, but that is what low pass filters
do: shove it under the rug. Unlike a water filter, for instance, which
collects undesirable elements and needs maintenance such as cleaning or
replacing, electrical filters redistribute energy. Unless you are a
fundamentalist or flat earther, look at (so to speak) the CMB, the
Cosmic Microwave Background radiation: that is the remnant of a very
fast event, the Big Bang, now spread out in time. The question is how
one can optimize the function, preferably with a matching conjugate
filter between encode and decode of a digital sampling system. Even the
nomenclature of "filter" is a reference to a frequency domain mindset;
what MQA does is more akin to conjugate interpolation of the digitized
waveform. Again, MQA is an efficiently done, non-brickwall, backwards
compatible, verifiable, end to end conjugate filtering system with
sufficiently low modulation noise and time smear, and sufficiently high
time resolution, to be truly transparent.
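The "area under a curve" picture can be tried out with a short Python
sketch. The Hann-windowed sinc below is a generic textbook low pass
filter, not anything specific to MQA or to a particular converter, and
the tap count and cutoff are arbitrary assumptions.

```python
import math


def windowed_sinc_lowpass(num_taps, cutoff):
    """Linear phase FIR low pass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hann)
    total = sum(taps)
    return [t / total for t in taps]   # normalize to unity gain at DC


def convolve(signal, taps):
    """Direct-form convolution, skipping zero input samples."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        if s != 0.0:
            for j, t in enumerate(taps):
                out[i + j] += s * t
    return out


click = [0.0] * 64
click[32] = 1.0   # a one-sample "pop"

smeared = convolve(click, windowed_sinc_lowpass(num_taps=63, cutoff=0.1))

area = sum(smeared)                                     # area under the curve
peak = max(smeared)                                     # height of the "blub"
width = sum(1 for v in smeared if abs(v) > 0.01 * peak)  # samples it occupies

print(f"area={area:.6f}  peak={peak:.3f}  width={width} samples")
```

The one-sample click keeps its area, but comes out much lower and spread
over many samples on both sides of the original event.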
~~~~~~~
To summarize: MQA is the only digital audio system or format that can
reproduce audio with 4 microseconds aperture uncertainty, which is what
current neurobiology indicates is the time resolution of the ear/brain.
That is as transparent as about 2 meters (6 feet for the USA) of air. No
other system or format even comes close. If digital audio were tested
with the same scrutiny that is placed on analog components, only MQA
would pass the test.
