Sorry, yes, I was referring to post-processing such as the remodulation of DSD256 to DSD512. Obviously a recording captured natively at DSD256 contains more information than one captured natively at DSD128. But how much of that is musical (i.e. actually audible) information? I don’t know.
In PCM, the difference between the information captured at 96kS/s and at 192kS/s is completely inaudible (it lies at audio frequencies of 48kHz and up, which are totally ultrasonic). It is the difference in the filter that can be used, and the audible side effects thereof, that causes the perceived difference in sound quality.
I strongly suspect that similarly, most (all?) of the increased payload in increasing bitrates of DSD is ultrasonic and inaudible. But that is just my guess, as I said my understanding of DSD is pretty rudimentary. I will discuss with James next time I am in the office and see if I can get a better understanding.
In the meantime this article by the ever-dependable Archimago provides some insights:
I think the general term you’re looking for is “interpolated” sample-points.
Whether for PCM or DSD, and whether offline (e.g. Merging Pyramix DAW) or done in real-time (e.g. Vivaldi Upsampler, HQPlayer etc.), interpolation is responsible for converting from a lower bit-rate to a higher one, with new music data sample-points interpolated from existing samples. The additional sample-points, though not from actual capture, naturally increase the file size (or bit-rate for real-time upsampling).
In the case of DSD-to-DSD upsampling like for DSD256-to-DSD512, the interpolation is I believe a 2-stage process; decimation of the lower rate DSD, and then re-modulation of that to the higher rate DSD.
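To make the two-stage idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the function names are made up, the "decimation" is a crude block average, the interpolation is linear, and the modulator is first-order. Real converters (Pyramix, HQPlayer, the Vivaldi Upsampler) use proper filters and higher-order modulators, so this only shows the shape of the process, not how any product does it.

```python
import math

def sigma_delta_modulate(samples):
    """Toy first-order sigma-delta modulator: floats in [-1, 1] -> list of +1.0/-1.0."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for s in samples:
        integrator += s - feedback          # accumulate the error vs. the last output bit
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits

def decimate(bits, factor):
    """Stage 1 (crude): average non-overlapping blocks of 1-bit samples into multi-bit values."""
    return [sum(bits[i:i + factor]) / factor
            for i in range(0, len(bits) - factor + 1, factor)]

def interpolate_linear(samples, factor):
    """Insert linearly interpolated points so the output is `factor` times longer."""
    out = []
    for a, b in zip(samples, samples[1:] + samples[-1:]):
        out.extend(a + (b - a) * j / factor for j in range(factor))
    return out

def upsample_dsd_2x(bits, decimation=8):
    """Stage 1: decimate to multi-bit. Stage 2: re-modulate at twice the original bit rate."""
    multibit = decimate(bits, decimation)
    dense = interpolate_linear(multibit, 2 * decimation)
    return sigma_delta_modulate(dense)

# A slow sine stands in for the music; the "low rate" stream plays the role of DSD256.
source = [0.5 * math.sin(2 * math.pi * 3 * i / 4096) for i in range(4096)]
low_rate_bits = sigma_delta_modulate(source)
high_rate_bits = upsample_dsd_2x(low_rate_bits)
print(len(low_rate_bits), len(high_rate_bits))
```

The output stream has twice as many bits as the input, yet every one of them was computed from the existing stream. Nothing new was captured.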
I actually wasn’t trying to venture down the rabbit hole on post-recording DSP with either format, but was reacting to the statement (which I apparently misunderstood) that the higher sample rate files do not contain more information. If they are originally recorded at the higher sample rate, i.e., PCM 192 v 44.1, or DSD256 v DSD64, they clearly do.
Re: choice of PCM or DSD, in both cases, if one is increasing the sample rate of an audible signal, the amount of audible information captured must also increase.
I’m afraid I’ve taken us off topic from @Lee’s thoughtful review (apologies!), but am happy to continue under one of the DSD threads.
Not necessarily. In PCM increasing the sample rate enables you to capture higher frequency sounds. But if they are ultrasonic they are inaudible (although your amp doesn’t know that, so it wastes a lot of energy amplifying them if they aren’t filtered out). So increasing the sampling rate of an audible signal does not necessarily increase the amount of audible information captured (unless it is below 44.1 kS/s to start with). I am as interested as you are to understand how this plays out in DSD.
Just pointing out that the difference in file size is not trivial.
A DSD256 file is 300% larger than a DSD64 file. In the example I shared, it’s more than 180MB larger, for just one song.
If only a small fraction of that storage is audible, I would be very, very surprised. Furthermore, it would shed a very negative light on Sony’s engineers, which, IMHO, would be even more surprising
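For anyone who wants to check the size arithmetic, a minimal sketch follows. It assumes raw 1-bit stereo DSD with no container overhead or compression, and the helper names are invented for this example. DSD64 is taken as 64 x 44.1kHz, with the higher rates as multiples of that.

```python
BASE_RATE_HZ = 44_100  # CD sample rate; DSD rates are integer multiples of it

def dsd_bytes_per_second(multiple, channels=2):
    """Raw 1-bit DSD payload in bytes per second (assumes no container overhead)."""
    bits_per_second = multiple * BASE_RATE_HZ * channels
    return bits_per_second / 8

def dsd_megabytes(multiple, seconds, channels=2):
    """Payload size in MB (10^6 bytes) for a given duration."""
    return dsd_bytes_per_second(multiple, channels) * seconds / 1_000_000

for m in (64, 128, 256, 512):
    rate_mhz = m * BASE_RATE_HZ / 1e6
    print(f"DSD{m}: {rate_mhz:.4f} MHz per channel, "
          f"{dsd_megabytes(m, 60):.1f} MB per stereo minute")

extra = dsd_megabytes(256, 60) - dsd_megabytes(64, 60)
print(f"DSD256 minus DSD64: {extra:.1f} MB per stereo minute")
```

Under these assumptions DSD256 is four times the size of DSD64 (i.e. 300% larger), which says how many bits are stored but nothing by itself about where in the frequency range those bits end up mattering.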
Great explanation by Anup and excellent comment by Richard.
I always questioned the concept, which the download services tried to sell (wink, wink), to get people to pay more to purchase higher bit rate files. The “implication” was there was more resolution and detail to be heard. How could they take something recorded, for example, in the 50’s, and then miraculously add additional recording information and give it the famous “hi resolution” tag - implying increased resolution and detail.
Hi Res was not even invented yet - so how did some of Louis Armstrong’s early recordings suddenly become “hi res”?
What they are doing for these older recordings is going back to the original analog tape from the 50s, and then playing and resampling it at a higher, HD rate, e.g., 192kHz.
So, they are not capturing more information than what was originally on the tape, but they are capturing a lot more information than what was lost by previously sampling the tape to the redbook CD format of 44.1 kHz.
As an alternative, one could, as @Dunc on this forum has done, buy the old tapes and an old tape player, and just play the analog tape this way. In this manner, one remains entirely in the analog domain. The digital domain, for many, is simply more convenient.
On good sites like HD Tape Transfers and NativeDSD they are usually very specific about provenance, and you can read more about this in the notes. As a general “rule of thumb” you will find original HD recordings in 192kHz, occasionally DXD (384kHz) and DSD64 - DSD256. You will virtually never find an original recording at DSD512, and all of those files are upsampled post production (I personally avoid them).
On your system, for example, you should be able to notice a meaningful difference between the original “Kind of Blue” redbook CD, for example, and a resampled HD version from the original tape/record press.
Extremely helpful explanation Richard - thank you.
Taking it even further (with reference to the quoted passage), one could say playing the original vinyl record yields more information than many differing resolution digital files.
Both tape recorders (electrical signal + magnetizable material onto tape) and vinyl records (vibrations + stylus into etched lacquer master) remain in the analog domain and therefore have no A2D losses (though they could have source-to-recording, i.e., A2A losses).
Not sure I follow. Why would it be the Sony engineers’ fault? (That DSD isn’t an efficient lossless compressed format?)
I believe as was mentioned previously (and may have been discussed in the extensively debated DSD256 support topic ), the primary value of high-rate DSD is to push noise-shaped quantization noise even higher into the ultrasonic range, not necessarily higher resolution.
Ostensibly, DSD128 already pushes the noise well beyond where it could impact the audio band, moving it from just above 20kHz with DSD64 to above 40kHz. DSD256, I believe, doubles that again to move quantization noise above 80kHz. Is that really necessary, and maybe more importantly, is there any real-world improved audibility? I don’t think there’s any objective data one way or another.
One of the leading proponents of DSD asked that exact same question when Native DSD was launched and he even supplied DSD 128 and 256 files for people to listen to. I ‘think’ the consensus was one could not hear a difference. My assumption was (and still is) that DSD 256 is an option for engineers to capture mic feeds where the noise is out of bounds, as Anupc notes, thus making it easier for engineers to post-process without having to deal with said noise.
You could be correct about the quantization noise.
Regardless, this does not change the fact that the analog sound in the audible range is still being captured at 2x the sample rate of DSD128 and 4x that of DSD64.
I just don’t see how one can argue that more audible information is not being captured.
It would be like saying there is no difference between a 44.1kHz and a 192kHz recording in PCM…
OTOH, the fact remains that beyond a certain resolution, the human ear can’t perceive any difference. Just like the human eye can’t actually perceive a difference beyond about 15 megapixels.
There’s no objective evidence that the resolution difference between DSD128 and DSD256 is perceptible or results in improved sound quality.
Whereas it’s important to capture sound or images of a high-enough resolution such that editing the media doesn’t impact the perceptible range.
When DSD experts who actually worked on the format, like Andreas Koch, explain that DSD256 has no real advantage in a playback system, I tend to believe them.
Sorry RG, but I think you have misunderstood this. Your argument is invalid because your ‘facts’ are incorrect.
There is no difference between 44.1kS/s** and 192kS/s in the audible frequency range. A 44.1kS/s sampling rate allows you to capture every frequency up to 22.05kHz perfectly. There is no better. That is the Nyquist-Shannon sampling theorem and it is proven.
Having 192kS/s expands the sound frequencies you can capture up to 96kHz which is ultrasonic and inaudible. An analogous effect applies to higher rate DSD although the math is different. DSD64 is sufficient to capture all audible frequencies. Higher sampling rates allow you to sample higher frequencies, they don’t give you any more information or resolution in the audible domain.
It’s not like hi res video where going from 4K to 8K gets you a higher pixel density. A better analogy would be that you get the same number of pixels but each pixel is now able to reproduce ultraviolet light. There’s more information, yes, but it’s invisible to us.
What these higher sampling rates do do is allow you to move the reconstruction filter (in the case of PCM) and quantization noise (in the case of DSD) up to higher frequencies and away from the audible band. Any audible improvements are a result of these two effects alone.
** I’m using the standard notation of S/s (samples per second) for sampling rates and Hz for audio frequencies just to distinguish the two. When I use the terms “audible/inaudible” and “ultrasonic” in all cases “to human beings” is implied.
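The sampling-theorem point can be tried numerically. The sketch below is only an idealised illustration under stated assumptions: the test signal is three made-up tones below 22.05kHz, chosen to repeat exactly within the analysis window so that a plain DFT gives exact band-limited interpolation. It samples the signal at 44.1kS/s, rebuilds what a 176.4kS/s capture would have contained using only those samples, and compares that with sampling directly at 176.4kS/s.

```python
import cmath
import math

FS_LOW = 44_100   # samples per second
FACTOR = 4        # 4 x 44.1 kS/s = 176.4 kS/s
N = 147           # low-rate samples; window = 1/300 s, so DFT bins are 300 Hz apart
# (frequency in Hz, amplitude): illustrative tones, all below 22.05 kHz and on 300 Hz bins
TONES = [(900.0, 1.0), (15_000.0, 0.5), (19_800.0, 0.25)]

def signal(t):
    """The 'analog' signal, evaluated at time t in seconds."""
    return sum(a * math.sin(2 * math.pi * f * t + 0.3) for f, a in TONES)

def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def reconstruct_at_higher_rate(x, factor):
    """Band-limited (trigonometric) interpolation; len(x) must be odd."""
    n = len(x)
    spectrum = dft(x)
    half = n // 2
    m = n * factor
    out = []
    for t in range(m):
        acc = sum(spectrum[k % n] * cmath.exp(2j * math.pi * k * t / m)
                  for k in range(-half, half + 1))
        out.append(acc.real / n)
    return out

low = [signal(i / FS_LOW) for i in range(N)]
high_direct = [signal(i / (FS_LOW * FACTOR)) for i in range(N * FACTOR)]
high_rebuilt = reconstruct_at_higher_rate(low, FACTOR)

max_error = max(abs(a - b) for a, b in zip(high_direct, high_rebuilt))
print(f"largest difference between direct and rebuilt samples: {max_error:.2e}")
```

In this idealised case the difference comes down to floating-point rounding: the extra samples in the high-rate capture were already implied by the 44.1kS/s samples. Real converters cannot use a perfect filter, which is exactly where the filter-related differences described in the post come in.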
First, thank you Anu for sharing this interview. I encourage anyone interested in DSD to watch it.
Second, respectfully, your summary is not consistent with what Andrew stated. In the interview, Andrew clearly states:
“DSD is closer to analog than PCM” and that this has been shown consistently in listening tests.
He also states, beginning at 25:00:
“The advantage in moving to 256 wasn’t really that great for the DAC, but for A-to-D there definitely is merit and there is an advantage.”
His claim that DSD128 is the “sweet spot” is made not because there is not an underlying advantage to DSD256–he clearly states there is–but, because the signal-to-noise ratio is reduced, it is more difficult to edit/process DSD256, if processing is needed.
Earlier he explains why microphone placement is so important with DSD256, to limit post-processing.
In summary, the interview is not a rebuke of DSD256 at all, but a reaffirmation of its advantages, with a few caveats for recording and playback.
@AndrewS, I confess I don’t understand your comment at all. If an analog signal is sampled at a higher frequency, and the bit depth remains constant, by definition more information is captured. This is confirmed in the file sizes.
I think you meant Andreas (not Andrew ). Also, the point you make is really quite irrelevant to our DSD128 vs. DSD256 debate.
He’s making the exact point that I mentioned; that the difference between (DSD64), DSD128, and DSD256 is in where quantization noise is. NOT as you contend because it sounds better, or that it carries more musical information.
Please listen carefully from @ 24.04min into the YouTube video;
DSD64 - “was a little bit close to the thresholds” (of the 20kHz audio band)
DSD128 - “we didn’t have to pay attention to the edges and filter slopes. A really nice format”
DSD256 - “It didn’t really make a big difference because whether the filter threshold is 40kHz or 80kHz, who cares”.
Which is why DSD128 is the sweet-spot.
Once again, the debate here is about DSD256 for playback, not recording.
I never said it was. Only that DSD256 is not necessary for playback, and DSD128 is good enough.
In fact, Andrew makes an excellent point in his post. Good reminder of first-principles; the Nyquist-sampling rate (44.1kS/s) already captures every analog signal in the 20kHz audio band in order to perfectly reproduce the original musical event.
The only reasons for higher sampling rates are;
(1) to ameliorate problems associated with traditional brickwall filters impacting the audio band
(2) to capture frequencies higher than 20kHz which may contain some musical information.
There’s no added musical information, within the 20kHz audio band, that’s captured by higher sample rates (PCM or DSD) beyond the Nyquist rate.
I see. Well, since a higher sample rate, even in the audible frequencies, doesn’t matter, then let’s all sell our dCS gear and return to listening to MP3!
When I upgraded my car from a BMW to a [XXX], it also “didn’t really make a big difference.” The new one was slightly faster, slightly more responsive, etc., but who cares, the old one was “good enough.”
(!)
I honestly am astonished by the negative sentiment toward DSD256 on this forum. One of the key engineers in the space says it is clearly better for recording (A to D), and has minor advantages in playback (D to A), with some caveats.
I would have thought that attempting to get closer to analog sound through higher quality digital would be a shared passion. But, to each his (or her) own
I’m happy to be in a group of one on this if that is the case!
Audible range: ~10% of sample size can detect up to 28kHz
ASA, 2007:
"Hearing thresholds for pure tones between 16 kHz and 30 kHz were measured by an adaptive method. The maximum presentation level at the entrance of the outer ear was about 110db SPL.
"To prevent the listeners from detecting subharmonic distortions in the lower frequencies, pink noise was presented as a masker. Even at 28kHz, threshold values were obtained from 3 out of 32 ears.
"No thresholds were obtained for 30 kHz tone. Between 20 and 28 kHz, the threshold tended to increase rather gradually, whereas it increased abruptly between 16 and 20 kHz.