dCS Ring DAC - A Technical Explanation

The following is taken from a response we put on another forum, but I think the information is really pertinent to the discussion here…

To improve the state of the art you need subject matter expertise. Only by understanding the benefits and limitations of various approaches to A-D and D-A conversion can we design a DAC that we feel is best in class. Of course, what we deem a critical performance requirement may not be the same as for other manufacturers or hobbyist engineers. However, we are one of a select few manufacturers who have designed world-class ADCs, DACs and sample rate converters for both studio and consumer use.

So, we are trying to point out the issues that exist in common architectures, which to sum up are:

You can have lots of weighted current sources at a low sample rate – the challenge here is matching these current sources, and keeping them matched over temperature variations and time. For audio, the side-effect of this is that any errors in this matching cause unnatural distortion (due to correlation), which the human auditory system is very sensitive to. On the plus side, because we don’t have to run that fast, jitter is less of an issue.

You can have conceptually a single current source, and run it at a much higher sample rate. This fixes the matching issue (because it’s self-referencing – on or off, and any drift will manifest as DC rather than distortion). Unfortunately to deal with the quantisation noise generated, you have to heavily noise shape this and move it up in band. This can cause issues because if you keep the clock at sensible rates, the quantisation noise is very close to the audio band, and if you move it too high in frequency jitter becomes a real issue (due to switching noise), at which point you may have to perform some quite horrible (and sometimes impossible) maths to match rates.

What the Ring DAC does is effectively a hybrid – the clock can run at sensible rates (so 3-6MHz) and the noise shaping can be gentle, but because we have multiple codes to represent, we need a way to match them exactly - whilst bearing in mind that components age, temperature can become a factor and so on. This is the job of the mapper, and it has numerous attractions, including distributing DAC errors away from where we are interested (audio frequencies) to where we are insensitive (very high frequencies), without altering the data presented to it, whilst at the same time ensuring all the sensitive components age in the same way.

We can definitively state that the Ring technology is not multiple DSD streams, and is not random – if you read the writeup, we even say “may appear random”.

It is quite correct to say that you cannot decorrelate noise that is already part of the signal. However, it is worth thinking about this as a philosophical point. One view is that noise shaping and filtering are evil because they somehow ‘guess’ and don’t reproduce the ‘original’ signal, and the ‘fewest steps must be the best’. Now, this may sound strange coming from us, but one of our beliefs can be attributed to Einstein – “Everything should be made as simple as possible, but no simpler”. So what is the original signal we are trying to reproduce? This will be covered more in the filtering article.


Right, that makes sense. Perhaps a more appropriate term I should’ve used is Noise-Shaping Requantizer. :smiley:


@James the series seems to have halted. Will there be a next post?

I read this expanding thread with great interest, but fear like my understanding of Relativity, which I was taught at A level, I’m somewhat behind the curve. But it is very enlightening thanks.

Will you be at some future point explaining the effects of the filters (for Vivaldi)? I’m a little lost, despite reading around this, as to what their sonic effects are and would be most interested.

Yes listening is recommended, but backing it up with some science would be much appreciated.

Cheers

The series is definitely still going ahead. The next post will be coming very soon, but we’re working on preparing some additional information that we hope will address some of the questions that we have received over the past few weeks.

Thanks for your patience, and I’ll be back with a new post on filtering within the next two weeks.

The next post(s) we have planned are on filtering, so once those are ready hopefully you will feel much more comfortable with the topic!


Part 5 – Filtering in Digital Audio

Most DACs will have some information in their specifications about the types of filtering they use. As these filters are an incredibly important part of the product, it is worthwhile explaining why and how they are used.

To understand why we need a filter, it helps to start at the beginning, when an analogue signal enters an ADC during the recording / production process. (This is significant, as the filter within an ADC has almost as much impact on what we hear during playback as the filter within a DAC.)

We have previously discussed how audio is sampled using an ADC – the analogue voltage is converted into a digital representation, with a series of ‘samples’ being taken to form this representation. The lowest sample rate used in audio is typically 44,100 samples per second (S/s). The reason for using this sample rate (44.1kS/s) is largely due to the Nyquist Theorem. This states that the sample frequency of digital audio needs to be at least twice the highest frequency in the audio being sampled. The highest frequency which can be sampled (half of the sample rate) is defined as the ‘Nyquist frequency’. As the human range of hearing extends up to 20,000Hz, accurately sampling this frequency range requires a sample rate of at least 40,000S/s.

However, what happens if what we are sampling doesn’t ‘fit’ into our sample rate’s valid range, between 0Hz and the Nyquist frequency? If this occurs, then the frequency components above the Nyquist frequency are ‘aliased’ down below it. This sounds counterintuitive, but it is illustrated here:

image

image

The above graphs show two signals: one at 1kHz and one at 43.1kHz, both sampled at 44,100 samples per second (44.1kS/s). Note that sampling the 43.1kHz signal produces samples which are indistinguishable from those of the 1kHz tone (though phase inverted). If this 43.1kHz signal were passed through the ADC, a 1kHz tone would be heard on playback. This means that the ADC must remove anything which does not ‘fit’ between 0Hz and the Nyquist frequency, to avoid these aliased images affecting the audio.
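The 1kHz / 43.1kHz example can be checked numerically. The following is a minimal sketch in plain Python (the function name `sample_sine` is made up for this illustration): it samples both tones at 44.1kS/s and confirms that every 43.1kHz sample is the 1kHz sample with its sign flipped.

```python
import math

FS = 44_100  # sample rate in samples per second

def sample_sine(freq_hz, n_samples, fs=FS):
    """Return n_samples of a unit-amplitude sine at freq_hz, sampled at fs."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

tone_1k = sample_sine(1_000, 64)
tone_43k1 = sample_sine(43_100, 64)

# If the 43.1kHz samples are the phase-inverted 1kHz samples,
# adding the two sets together should give (almost exactly) zero.
max_diff = max(abs(a + b) for a, b in zip(tone_1k, tone_43k1))
print(f"largest |1kHz + 43.1kHz| sample sum: {max_diff:.2e}")
```

The printed value is limited only by floating-point rounding, which is another way of saying that, once sampled, the two tones cannot be told apart.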

The removal of anything which does not fit between 0Hz and Nyquist frequency is carried out by way of a low-pass filter. This filter removes any content above a certain frequency, and allows anything below that frequency to pass through, ideally unchanged. This filter can be implemented in either the digital or analogue realm.

It would seem that the most obvious solution to the aliasing problem caused by running an ADC at a sample rate of 44.1kS/s is to implement a filter which does nothing at 20,000Hz, but cuts everything above 20,001Hz. This would allow the removal of any unwanted alias images from the A/D conversion, while ensuring the audio band remains unaffected. However, such a filter is highly inadvisable. For one thing, if using a digital filter, the computing power required to run such a filter would be excessive. Real filters do not cut off abruptly: they reduce the amplitude of the signal progressively above a given frequency, along a slope measured in decibels per octave. For this reason, the audio is sampled at a higher rate than simply double the highest frequency we are trying to record (44,100S/s instead of 40,000S/s), which leaves some room for the filter to work. The filter can now act between 20,000Hz and 22,050Hz without aliasing becoming an issue, while also leaving the audio frequencies humans can hear unaffected.

image

This diagram illustrates a low-pass filter for 44.1kHz audio.

This is still an extremely narrow ‘transition band’ to play with. If this is done with an analogue filter, the filter will have to be very steep – this is problematic as analogue filters aren’t phase linear (the filter will delay certain frequencies more than others causing audible issues) and are pretty much guaranteed to not be identical. This is okay when they are working at say, 100kHz, but at 20kHz this becomes very problematic. As such, the filter used to remove any content from the Nyquist frequency and up is implemented in the digital domain, in DSP (Digital Signal Processing).

In audio recording, it is common practice to use a high sample rate ADC and perform the filtering at the Nyquist frequency on the digital data instead. This method is known as an ‘Oversampling ADC’. The block diagram for a dCS oversampling ADC producing 16-bit 44.1k data is shown here:

image

The analogue low-pass filter removes high frequencies from the analogue signal above 100kHz, as these would cause aliasing. As previously discussed, this analogue filter acting at 100kHz can be gentle and acts in a region where non-linearities are not as critical.

The ADC stage then converts the signal to high-speed digital data. In a dCS ADC, this stage is a Ring DAC in a feedback loop, so it produces 5-bit data sampled at 2,822,400 samples per second (64 × 44,100).

The Downsampler converts the digital data to 16-bit 44,100 samples per second. This data then passes through a sharp digital filter, which effectively removes content above 22.05kHz. (Frequencies higher than this will cause aliases if not filtered out.) The PCM encoder then formats the data into standard SPDIF, AES/EBU and SDIF-2 serial formats, complete with status and message information.

The digital filter used in the Downsampler comes with its own set of trade-offs. To simplify this greatly, digital filters work by passing each sample through a series of multipliers, with these multipliers collectively acting to filter higher frequencies from the signal. The pattern formed by the values of these multipliers is referred to as the filter ‘shape’ (symmetrical or ‘half-band’ filters, asymmetrical filters). Different filter shapes have different impacts on the sound.

This diagram illustrates an example of the response of a symmetrical digital filter. They are called this as they produce symmetrical ‘ringing’ when driven with an impulse (also known as a transient). This results in an acausal response before the impulse. The effect is more pronounced at lower sample rates:

image

This diagram illustrates an example of an asymmetrical filter response. This filter type has a completely different impulse response – here, there is no ringing before the impulse, but there is more ringing after the impulse when compared to a symmetrical filter:

image

Given the fact that the ADC must use a filter to remove aliases, and that a digital filter acting at the Nyquist frequency is preferable to using a harsh analogue filter, there will therefore be pre- and/or post-ringing introduced at the recording stage by the digital filtering in the ADC. This is a good trade-off to make, and the filter choice here is important.

Most ADCs will work using a symmetrical filter. What this means is that for any digital recording, there will be (necessary) pre- and post-ringing present on the recording, as a result of the filter which was used. The key point to be made here is that all digital recordings will include ringing from the filters, even before they reach the DAC, but this is the best approach to take – provided the filters are correctly designed and implemented within the ADC.

The other side of this topic is the DAC, where the digital audio recorded by the ADC is translated back to analogue for playback.

When a DAC reproduces an analogue waveform from digital samples, an effect similar to aliasing occurs. This is where, due to the relationship between the frequency of the analogue audio signal and the sample rate of the digital signal, ‘copies’ of the audio spectrum being converted can be observed higher up in the frequency spectrum. While these images exist at frequencies outside the range of human hearing, their presence can have a negative impact on sound.

There are two reasons for this. Firstly, content at frequencies above 20,000Hz can still interact with, and have an audible impact on, frequencies lower down in the audible spectrum (between 0 and 20,000Hz).

Secondly, if these images – known as Nyquist images – are not removed from an audio signal, then the equipment in an audio system may try and reproduce these higher frequencies, which would put additional pressure on that system’s transducers (particularly those responsible for reproducing high frequencies) and amplifiers. Removing Nyquist images means an amplifier has more power available to use for reproducing the parts of an audio signal that we do want to hear, which leads to better performance and a direct positive impact on sound.

As in an ADC, the solution to the problem posed by Nyquist images in D/A conversion is to filter anything above the highest desired frequency of the audio signal by using a low-pass filter. This allows Nyquist images to be eliminated from the audio signal, without impacting the music we want to hear. The question of how a low-pass filter should be designed is a complex and sensitive topic – and it’s important to note that there is no one-size-fits-all solution.

Of course, when working with source material which is at higher sample rates than 44.1kHz, such as hi-res streamed audio, the requirements of the filter in the DAC change. There is a naturally wider transition band, and as such the filter requirements will be different. Most DAC manufacturers offer a single set of filters which are cascaded for different sample rates. Given the different filtering requirements posed by converting different sample rates, this is not the optimal approach to take in a high-end audio system.

For this reason, the filters found within dCS products and the Ring DAC are written specifically for each sample frequency by dCS engineers. Further to this, there are multiple filter choices available for each sample frequency in a dCS product. There is no one right answer to filtering, as it depends on the listener’s preference and the audio being reproduced, so a choice of very high-quality filters bespoke for the Ring DAC and the sample frequency of the audio are available for the user to choose from.

The next post will explore the details of how digital filters are designed for use in audio products, exploring factors such as cut-off frequency, filter length and windowing.



Very interesting, thanks James.

James this series is fascinating and I really look forward to the next part. One main feature is how well it is written, Even a dummy like me can understand it ( mostly :grin:).


Superb. Thanks so much James.

I’m in the middle of arranging a new hifi system after 35 years of not keeping up with that world and your DCS DACs were suggested as one possibility hence I’m reading your articles to understand why they are “different” and so much more expensive. So, my first question:

I don’t understand this issue — if your underlying digital representation is signed 16 bits (i.e, -32768 …32767) for input, how would you ever get a value of 32768 in the first place? In other words, this seems like a non-existent issue in reality and so it’s unclear why this issue would be even mentioned.

Thanks,
D

Digital representation is not signed.

For a value of 32767 the lower significant bits 1 to 15 are required, highest significant bit 16 is set to zero.
For a value of 32768 only highest significant bit 16 is required, lower significant bits 1 to 15 are set to zero.

Thanks for responding.

Oh I see-- this is not about signed vs unsigned — this is an issue of changing 15 bits to 0 and one bit to 1?

If so, then why isn’t this a problem going from 2^n - 1 to 2^n for any value of n, not just when you are at 2^15 -1 ?

Changing all 16 bits is the worst case.

Part 6 – Filter Design in ADCs and DACs

How do the digital audio filters discussed in Part 5 work, and how should they be designed? This may sound counterintuitive, but the filtering required in both ADCs and DACs is actually very similar.

In an ADC, the audio is coming into the converter at a higher rate than we want to output – typically by a power of 2. To deal with this, the converter must remove content above the Nyquist frequency, which allows it to drop samples (allowing it to lower the sample rate from 88.2k to 44.1k for example) without content from above the Nyquist frequency aliasing down below it. This is handled in Digital Signal Processing by way of low-pass filtering.

In an oversampling DAC, audio is coming in at a lower rate than we want to feed to the converter. There are several approaches for how to tackle this issue, but by far the most effective is to insert samples with an amplitude of zero between the actual audio samples to increase the sample rate, then low-pass filter the signal to remove the Nyquist images this process creates. Once again, DSP is used to implement low-pass filtering.
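To see why zero insertion calls for a low-pass filter, here is a small sketch in plain Python. It is an illustration only, with made-up helper names (`zero_stuff`, `tone_level`), and is not taken from any product's DSP. A 1kHz tone at 44.1kS/s is raised to 88.2kS/s by inserting one zero after each sample, and the level of the wanted tone is compared with the level of its image at 43.1kHz.

```python
import math

def zero_stuff(samples, factor):
    """Insert (factor - 1) zero-valued samples after every input sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out

def tone_level(samples, freq_hz, fs):
    """Magnitude of a single DFT bin at freq_hz, normalised by the length."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / fs) for n, s in enumerate(samples))
    return math.hypot(re, im) / len(samples)

FS_IN = 44_100
# 441 samples = 10ms = exactly ten cycles of a 1kHz tone
tone = [math.sin(2 * math.pi * 1_000 * n / FS_IN) for n in range(441)]
stuffed = zero_stuff(tone, 2)  # now at 88.2kS/s

wanted = tone_level(stuffed, 1_000, 2 * FS_IN)
image = tone_level(stuffed, 43_100, 2 * FS_IN)
print(f"1kHz level: {wanted:.4f}   43.1kHz image level: {image:.4f}")
```

Before any filtering, the image is just as strong as the wanted tone, which is why the low-pass filter that follows the zero insertion is essential.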

How is this digital low-pass filtering carried out? In simplified terms, the digital audio signal is run through a series of ‘coefficients’ – multipliers which change the amplitude of the audio sample by an amount defined by a number between 0 (no output) and 1 (the original full amplitude of the sample). Each of these coefficients is what is referred to as a ‘tap’. A higher number of taps means the signal is run through a larger number of coefficients in the filter. The output of the filter at any one point is the sum of all of these coefficients multiplied by their respective samples.
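The ‘sum of coefficients multiplied by samples’ can be written out directly. This is a minimal sketch in plain Python, assuming a hypothetical 4-tap averaging filter chosen only because its numbers are easy to follow; real audio filters use far more taps and carefully designed coefficients.

```python
def fir_filter(samples, taps):
    """Direct-form FIR: each output is the sum of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, coeff in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += coeff * samples[n - k]
        out.append(acc)
    return out

# Hypothetical 4-tap moving average (illustration only)
taps = [0.25, 0.25, 0.25, 0.25]

# A single full-scale sample surrounded by zeros
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = fir_filter(impulse, taps)
print(response)  # [0.25, 0.25, 0.25, 0.25, 0.0, 0.0]
```

Feeding the filter a single non-zero sample simply reads the coefficients back out one by one, which is why an impulse response plot is a picture of the filter's coefficients.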

Audio websites and magazines often feature ‘impulse response’ plots of filters in audio products. These typically show the output from a filter when it is given a run of samples with an amplitude of zero, then a single full-scale (all 1s) sample, then zero samples again. The effect of this is to show the coefficients in a filter, which define how the filter works. Typically, this filter is derived from what is known as the ‘sinc’ function. The sinc function is defined as sin(x)/x and looks a little like the below graph.

image

There are several useful properties of the sinc function – it acts as an ideal low pass filter (the set of coefficients used in a digital filter, as previously described, are taken from this function, hence the Y axis being shown as ≤1, which will be shown below); it is mirrored in time in both directions (before and after the impulse, the single full-scale sample shown in the middle of the above sinc graph) and it offers the same delay at all frequencies (not all filters do this). An analogue filter will delay some frequencies more than others, creating phase issues. Filters which offer the same delay at all frequencies are referred to as phase linear.
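For anyone who wants to reproduce the shape of the sinc graph, a short sketch in plain Python follows. The only detail beyond the text is the value at x = 0, where sin(x)/x is taken as its limit of 1; the evaluation points are arbitrary.

```python
import math

def sinc(x):
    """sin(x)/x, using the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

# Evaluate at steps of pi/4 either side of the centre
xs = [k * math.pi / 4 for k in range(-8, 9)]
values = [sinc(x) for x in xs]

for x, v in zip(xs, values):
    print(f"x = {x:+6.3f}   sinc(x) = {v:+.4f}")
```

The printed values peak at 1 in the centre and are identical either side of it, which is the mirror symmetry in time described above.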

Given these factors, the sinc function can be manipulated to provide the desired frequency response. The three main factors that can be adjusted with digital filters are:

  • The -6dB point – the frequency at which the filter reaches 6dB of attenuation.
  • The filter length – the number of coefficients used in the filter, with one coefficient often being referred to as a ‘tap’.
  • The windowing technique – this is linked to both of the above factors (this is a simplified explanation).

Cutoff Frequency (-6dB Point)

To explore the cutoff frequency of a filter, take the below graph, which shows the most commonly used digital filter. This is a ‘half-band’ or ‘Nyquist’ filter, with coefficients based on the sinc function. This filter is designed in such a way that the -6dB point is at the Nyquist frequency of our target. For a low-pass filter in, for example, an ADC, the Nyquist frequency is set at 22.05kHz. For a tap length of 128, this generates the below impulse response.

image

The crucial aspect of this filter is that every other coefficient (the data points on the graph marked by an X) is 0 – which makes it twice as computationally efficient as a non-Nyquist filter (multiplying by a coefficient of 0 always gives 0, so those multiplications can be skipped). The drawback to this type of filter is that by definition it only reaches 6dB attenuation at the Nyquist frequency. Thus, aliasing artifacts from the ADC which will be mirrored down to below the Nyquist frequency will not be correctly attenuated. We would therefore ideally want the -6dB point to be below the Nyquist frequency, to remove said aliases and to maintain good attenuation at the Nyquist frequency.
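Both properties of the half-band filter can be checked with a few lines of plain Python. This is a sketch under stated assumptions: it uses the ideal, unwindowed coefficients h[n] = 0.5 × sin(πn/2)/(πn/2), and only 17 taps rather than 128 to keep the output short. The function name is made up for the illustration.

```python
import math

def half_band_taps(half_length):
    """Ideal half-band coefficients for n = -half_length .. +half_length."""
    taps = []
    for n in range(-half_length, half_length + 1):
        x = math.pi * n / 2
        taps.append(0.5 if n == 0 else 0.5 * math.sin(x) / x)
    return taps

HALF = 8
taps = half_band_taps(HALF)
centre = len(taps) // 2

# Which coefficients (other than the centre one) are zero?
zero_taps = [n for n in range(-HALF, HALF + 1)
             if n != 0 and abs(taps[centre + n]) < 1e-12]
print("zero-valued taps at n =", zero_taps)

# Gain at a quarter of the sample rate, i.e. the Nyquist frequency of the target rate
gain = sum(taps[centre + n] * math.cos(math.pi * n / 2) for n in range(-HALF, HALF + 1))
print(f"gain at the target Nyquist frequency: {20 * math.log10(gain):.2f}dB")
```

Every even-numbered coefficient away from the centre comes out as zero, and the gain at the target Nyquist frequency is one half, which is the -6dB point.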

image

Windowing Technique

There are some drawbacks to using the sinc function as the basis for a digital filter – firstly, the filter would be infinitely long (due to the sinc function being a mathematical function with no defined length), which is problematic in the real world. It also requires data from the future, as the filter coefficients include data from before the current sample being played. So how should these issues be addressed?

A good start would be to define the filter as not infinitely long. Doing so would allow the incoming audio signal to be buffered by a certain number of samples. This means the filter can then effectively have some data from the ‘future’ by delaying the entire audio signal by that amount of samples.

However, defining the filter with a finite length leads to some mathematical problems – the sinc function starts to get very small when moving further away from the central impulse, but not so small that the function becomes irrelevant. Sinc is infinitely long, but it is unwise to simply select a finite section of the sinc function to use as the basis for a filter, chopping off the ‘start’ and ‘end’ of the function to create a finite length. Doing so actually causes the resultant filter to not work very well. The below graph shows an example of a digital low-pass filter designed to work for CD rate audio, with a tap-length of 64 taps, where the ends of the sinc function have simply been chopped off to provide the coefficients used in the filter.

image

As can be seen from this graph, this filter does not provide a good response. The stop-band rejection above 22.05kHz is poor, and there is a significant ripple in the passband (below 22.05kHz). This problem is best addressed through a technique called ‘windowing’. This involves taking the coefficients for the filter we want to use, such as a finite section of the sinc function, and then multiplying them by another set of coefficients to reduce the negative effects seen above.

There are many approaches to windowing (different sets of functions / coefficients) used, but for this example, a raised cosine window will be used.

image
This diagram shows an example of a Raised Cosine window.

Applying this window to the above 64-tap filter results in the below frequency response:

image

As can be seen, this provides drastically better results. The stopband has far better rejection and the rippling seen in the passband is no longer present. What this means is that when designing a filter for use in audio, the windowing function needs to be selected carefully for correct filter performance.
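The effect of windowing can be reproduced approximately with plain Python. The sketch below is not the filter behind the graphs above: it assumes a 64-tap low-pass filter running at 88.2kS/s with its cutoff at 22.05kHz (0.25 of the sample rate), and it measures the worst-case gain from 0.35 of the sample rate upwards as a stand-in for ‘stopband rejection’. The raised cosine window is taken to be the common Hann form.

```python
import math

N_TAPS = 64      # filter length used in the example above
CUTOFF = 0.25    # assumed cutoff, as a fraction of the sample rate

def truncated_sinc(n_taps, cutoff):
    """Low-pass coefficients from a sinc whose ends are simply chopped off."""
    centre = (n_taps - 1) / 2
    taps = []
    for n in range(n_taps):
        m = n - centre
        taps.append(2 * cutoff if m == 0
                    else math.sin(2 * math.pi * cutoff * m) / (math.pi * m))
    return taps

def raised_cosine_window(n_taps):
    """Raised cosine (Hann) window: 0 at both ends, close to 1 in the middle."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_taps - 1)) for n in range(n_taps)]

def gain_db(taps, freq):
    """Filter gain in dB at freq, given as a fraction of the sample rate."""
    re = sum(h * math.cos(2 * math.pi * freq * n) for n, h in enumerate(taps))
    im = sum(h * math.sin(2 * math.pi * freq * n) for n, h in enumerate(taps))
    return 20 * math.log10(max(math.hypot(re, im), 1e-12))

def worst_stopband_db(taps):
    """Highest gain found between 0.35 and 0.5 of the sample rate."""
    return max(gain_db(taps, 0.35 + 0.15 * k / 300) for k in range(301))

chopped = truncated_sinc(N_TAPS, CUTOFF)
windowed = [h * w for h, w in zip(chopped, raised_cosine_window(N_TAPS))]

chopped_db = worst_stopband_db(chopped)
windowed_db = worst_stopband_db(windowed)
print(f"chopped sinc : {chopped_db:.1f}dB worst-case stopband gain")
print(f"with window  : {windowed_db:.1f}dB worst-case stopband gain")
```

With the same 64 coefficients, multiplying by the window lowers the worst-case stopband gain substantially. The price is a wider transition band, which is one reason the window has to be chosen together with the filter length and the cutoff frequency.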

Filter Length

The third factor to consider with filter design is the length of the filter itself. As previously discussed, we need to filter the output of a DAC to prevent imaging. This filter needs to be of finite length – longer than nothing but not infinitely long. If no filtering is carried out on the DAC’s output, and the samples are simply played back as they come in, the result is not ideal:

image
This graph shows the frequency response of a Non-Oversampling (NOS) DAC with no filter on the output.

As discussed in the previous article, a filter for 44.1k digital audio should not affect signals below 22.05kHz, but should heavily attenuate them above this. With no filter in place, not only is there very little attenuation above 22.05kHz (thus leading to lots of false high frequency components), we are also affecting the signal we are interested in: the frequency response at 20kHz is -3dB.
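The figure of roughly -3dB at 20kHz can be estimated with a simple model. The sketch below assumes the unfiltered NOS DAC behaves as an ideal zero-order hold (each sample value held for one sample period), whose gain follows sin(x)/x with x = π × f / fs. Real hardware will differ in detail, so treat the numbers as indicative.

```python
import math

def zoh_gain_db(freq_hz, fs):
    """Gain in dB of an ideal zero-order hold at freq_hz for sample rate fs."""
    x = math.pi * freq_hz / fs
    return 20 * math.log10(abs(math.sin(x) / x))

FS = 44_100
droop_20k = zoh_gain_db(20_000, FS)   # wanted signal at the top of the audio band
image_24k1 = zoh_gain_db(24_100, FS)  # first image of that 20kHz signal (44.1k - 20k)

print(f"gain at 20kHz  : {droop_20k:.2f}dB")
print(f"gain at 24.1kHz: {image_24k1:.2f}dB")
```

Under this model the wanted 20kHz signal is down by a little over 3dB, while its first image at 24.1kHz is only a couple of dB lower still, which matches the two problems described above.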

So, how long should the finite length filter on the DAC output be? To illustrate this factor, the same previously mentioned Nyquist filter with a raised cosine window will be used. Firstly, a 32 tap filter:

image

image

As can be seen from the above graphs, the frequency response is still far too droopy to be appropriate for use – at 20kHz, the response is still down about 2.8dB. Images above 33kHz (or images in the region of 0-11kHz) are attenuated by 50dB. The transition band width is in the region of 20kHz.

Next, the same Nyquist filter with a raised cosine window will be used, but this time with a length of 256 taps:

image

image

This response is visibly much better. The pass-band response is very flat up to 20kHz, and the transition band is approximately 4kHz wide. Images above 22.05kHz are suppressed nicely. Looking at the impulse response, however, it can be noted that much more data is required before the current sample can be played. As such, a larger number of samples from ‘the future’ are required for this filter and will have an impact on the current sample, and more samples from ‘the present’ will have an impact later on.
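To put rough numbers on the samples from ‘the future’, the sketch below works out how much time half of a symmetrical filter covers. It assumes the filter runs at 44.1kS/s and that exactly half of its taps lie ahead of the current sample; the tap counts are the ones used as examples in this post.

```python
FS = 44_100  # samples per second

def look_ahead_ms(n_taps, fs=FS):
    """Time covered by the half of a symmetrical filter that lies ahead of the current sample."""
    return (n_taps / 2) / fs * 1000

for n_taps in (32, 256, 1024):
    print(f"{n_taps:5d} taps: about {look_ahead_ms(n_taps):.2f}ms of look-ahead")
```

The look-ahead grows in direct proportion to the filter length, so each step up in tap count buys a narrower transition band at the cost of a longer stretch of audio influencing every output sample.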

Lastly, using the same example upped to 1024 taps:

image

image

As can be seen here, lengthening the filter has a few effects. Firstly, the transition band has become much narrower, so there is less out of band energy. There are, however, some negative effects – because there are more coefficients performing more multiplies, the stopband noise rejection is actually starting to degrade. Noise is beginning to accumulate in the stop-band (which can be partly compensated for with the windowing). There are also now a large number of samples from the ‘future’ being used (in this example of 1024 taps at 44.1k, around 11ms worth) and an equal number will be affected in the ‘past’. The effects of these factors are debatable, but where these samples come from has to be considered. With that in mind, a good real-world example…

dCS 904 – 44.1k, Filter 1

image
This graph shows the frequency response of the dCS 904 running at 44.1kS/s using Filter 1.

This graph shows the filter frequency response from a very capable ADC, used in many recordings – a dCS 904. The first thing to note about this example is that it doesn’t use a Nyquist filter like the examples above. The attenuation by the Nyquist frequency is 20dB. This is important as an ADC, by definition, deals with signals that are not bandlimited. The final filter is around 100 taps long, meaning that effectively there is little to be gained from using a much longer filter on replay (inside the DAC). Considering what this response means, the following chart is helpful:

image

This graph helps to illustrate the area of uncertainty caused by this filter. In the intersecting area, it isn’t possible to differentiate between real signal and an alias. As such, use of excessively long filters here leads to heavy use of DSP to reproduce what is effectively the transition band of the ADC – signal which is undesirable, and unknown.

What this goes to show is that with digital filter design, the signal chain as a whole needs to be considered, as opposed to just the DAC in isolation. DAC filters which are likely to work well with realistic ADC filters are ideal – in reality the use of a filter which is either not present, too short or too long in a DAC can have detrimental effects. Filter length of course must be balanced with the other factors described above, which is where good engineering comes into play – understanding how to employ the necessary trade-offs to create a set of filters which work well regardless of what content is thrown at them. This is the reason a dCS DAC has so many filters to choose from – the DAC doesn’t (and can’t) know the filters which were used to create the signal, so several options allow the user to achieve the best musical experience irrespective of source material.

dCS’s experience with both ADCs and DACs leaves us in a very strong position to be able to create DAC filters which consistently perform to the highest standards, both in testing and with real-world musical signals.


Hi folks,

Following our series on D/A conversion, we have a new set of posts to share. These posts will focus on another important aspect of the D/A conversion process, which is clocking. Whilst a DAC’s circuitry is responsible for making sure the right voltage is generated on conversion (ideally without correlated errors), clocking is responsible for ensuring the conversion happens at the right time. This series aims to explore why this is important in digital audio, along with good practices related to clocking.

We’ll start with discussing the basics of clocking and what happens when clock systems fail to produce an accurate signal. We’ll also examine the effect this causes, jitter, and the consequences this can have on digital playback, before going on to look at clock synchronisation in audio systems with multiple components, and the role of clocking in relation to asynchronous audio, such as music streamed over the internet or played via USB. Alongside this, we’ll provide an insight into the design of both dCS clock systems and dCS Master Clocks, and explain the steps we take to ensure our clocks deliver an accurate and consistent signal.

Our aim is to provide some helpful information on the role and importance of clocking in digital audio, and answer some common questions related to this topic, as well as providing a deeper look at our approach to developing clock systems and Master Clocks (something we’ve been doing for over three decades). We hope you’ll find it useful.


Part 1 - The Basics of Clocking

Clocking is an integral part of digital audio, and almost all audio products have a clock inside. As discussed in our previous series on digital to analogue conversion, digital audio recordings are made up of a series of samples. An audio product such as a DAC has to know when to do something with the samples it receives at its input, whatever this task might be – and this is where clocks come in.

In the context of digital electronics, the term clocking refers to a signal that keeps all of the circuits within a system in sync and operating at the same time. In order to generate a precise and reliable signal, the clock system must have a source: something that defines how long a period of time is. This source usually comes in the form of an oscillator – an electrical circuit that provides a regular rising and falling of voltage.

At dCS, we use quartz crystal oscillators as the basis for our clock systems. Quartz is a piezoelectric material, meaning that when a voltage is applied to it, it physically deforms and flexes back and forth. The crystal can be designed to resonate mechanically at a particular frequency (for example, every 44,100th of a second) and with a correctly designed electrical circuit, this resonance can be converted into an oscillating voltage.

The frequency of the crystal’s resonance lets a system know how long a specific increment of time (such as 1/44,100th of a second) is. Through measuring these increments of time, a system can accurately space audio samples apart. This avoids any unwanted movement of the samples in time, which, if it occurred during the digital to analogue conversion process, could cause distortion of the audio signal heard during playback.

The design of clocks in digital audio (both internal clock systems and external master clocks) is a topic worthy of serious consideration when purchasing any high-end audio system. Arguably, clocking can have as much of an impact on sound quality as DAC circuitry, and it’s vital to consider the design and implementation of the clocking system as a whole, rather than simply selecting components that have an impressive specification on paper.

As the clock defines the timing of a DAC’s operations, it is responsible for ensuring the samples are converted at the correct time, which is crucial to ensuring the audio we hear during playback sounds as it should.

Jitter

If a clock system fails to produce a signal correctly during the D/A conversion process, or the signal is unable to correctly reach a DAC, then we experience something called jitter, which is highly undesirable in audio.

Jitter is described as any irregularity in the timing of the clock used by a DAC, and it is produced in a variety of ways. It can be the result of bad analogue design, electromagnetic interference, poor quality digital audio cable or a number of other causes, which we’ll discuss in future posts.

The actual audible effect of jitter depends upon its nature, but it can have a notable impact on sound. If jitter is periodic, sidebands will appear either side of the signal frequency. This sounds like a harsh distortion, as artificial components are being added to the audio. If the jitter is noisy in nature, this results in a ‘smearing’ of signal energy. This, in turn, increases the noise floor of a system, which has the effect of masking fine detail in the music.
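To make the sideband effect concrete, here is a minimal Python sketch. It is not a dCS measurement, and every number in it is an illustrative assumption. It samples a sine wave at instants displaced by a small sinusoidal (periodic) timing error, then inspects individual frequency bins with a plain DFT. The function names `jittered_sine` and `bin_magnitude` are invented for this example.

```python
import math

def jittered_sine(n_samples, signal_bin, jitter_bin, jitter_amp):
    """Unit sine sampled at instants displaced by periodic jitter.

    Time is measured in sample periods. jitter_amp is the peak timing
    error as a fraction of one sample period (0.0 means a perfect clock).
    """
    out = []
    for n in range(n_samples):
        t = n + jitter_amp * math.sin(2 * math.pi * jitter_bin * n / n_samples)
        out.append(math.sin(2 * math.pi * signal_bin * t / n_samples))
    return out

def bin_magnitude(samples, k):
    """Magnitude of DFT bin k, scaled so that a unit sine reads 1.0."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

if __name__ == "__main__":
    # Assumed values: signal in bin 100, jitter repeating in bin 10,
    # peak timing error of 5% of a sample period.
    clean = jittered_sine(1024, 100, 10, 0.0)
    dirty = jittered_sine(1024, 100, 10, 0.05)
    for k in (90, 100, 110):
        print(k, bin_magnitude(clean, k), bin_magnitude(dirty, k))
```

With a perfect clock, bins 90 and 110 are empty. With the periodic timing error, components appear in both bins at roughly 1.5% of the signal's amplitude. These are the sidebands described above, sitting at the signal frequency plus and minus the jitter frequency.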

The above graphs show an example of what can happen with poor clocking. In both examples, a sine wave has been reconstructed by a DAC using 25 samples. Each of these samples has exactly the same amplitude in both graphs; the only factor which has changed is the timing of when several of the samples are converted. The result is a visible degradation of the signal. Were this signal to be played back through a transducer, the signal in the lower graph would sound noticeably worse than the top due to the jitter.

While the example above is rather exaggerated, it illustrates that the right sample at the wrong time is the wrong sample. This goes to show just how vital accurate clocking is to a digital audio playback system.

The human ear and brain are extremely sensitive to irregularities in the timings of sound. If a DAC experiences jitter, and fails to convert signals into analogue voltages at the correct time, then the sense of space in a performance can be heavily skewed or even lost. It’s for this reason that dCS engineers take great care to minimise jitter in all aspects of our design.

If jitter is introduced at the recording stage, it will remain in the signal forever. There are steps that can be taken to prevent further degradation of the signal (such as re-clocking or even buffering the signal into RAM and out again) but it isn’t possible to correct or remove jitter that arises during the recording process.

At the playback stage, jitter should be minimised wherever possible to avoid the signal we hear being compromised or altered. Provided a recording is of decent quality, having a DAC reproduce audio samples at exactly the right moment will help to ensure that listeners experience an accurate representation of the original sonic event. A signal coming into a DAC from an external source, such as a CD transport, can actually have very irregularly spaced samples on arrival (to a point, which we’ll cover in future posts), but provided the DAC itself converts those samples at regular intervals, the sound quality will remain unaffected.

Our next two posts will cover the main kinds of jitter: intrinsic and interface.

Once again thanks James. What a fascinating series. How many other manufacturers would even bother?

Many other DAC manufacturers don’t have much to write about - they use a standard DAC chip, choose/implement some filters and analog stages and that’s it. Pretty short stories I’d say :wink:

Part 2 - Jitter (intrinsic)

There are two main kinds of jitter: intrinsic and interface. Intrinsic jitter refers to jitter which is produced inside of a product like a DAC, through effects like phase noise on the oscillator. Interface jitter refers to jitter which is picked up by the interface(s) used to transfer the audio and clock signals. This could come either as interference picked up by the cable itself, or through the cable essentially acting as a filter for certain frequencies, impacting the integrity of the square wave (the output of the clock circuitry) passing through it.

There are several varieties of quartz crystal oscillators, but Voltage Controlled Crystal Oscillators (VCXOs) and Oven Controlled Crystal Oscillators (OCXOs) are two of the most common in audio.

Voltage Controlled Oscillators, or VCOs, are also used in audio products, but these operate on a purely electronic basis, and do not use an electromechanical material such as quartz to generate signals.

Quartz oscillators tend to have better phase noise performance than VCOs, meaning the oscillator itself may be less prone to jitter. What this means in terms of overall clock design is that in a DAC with a quartz oscillator, the Phase Locked Loop, or PLL (the circuit which matches the frequency of the DAC’s clock to the clock of the incoming audio signal) can be biased towards rejecting interface jitter by means of a narrower PLL bandwidth.

This is possible as the quartz oscillator itself is less prone to phase noise and subsequently jitter. As such, if jitter should happen to be present on the interface, for example a jittery AES signal, this jitter will not be passed on to the DAC, as it will have come and gone before the PLL reacts to it. The DAC instead relies more on the oscillator for timing accuracy between individual samples which, in the case of a dCS product and its quartz crystal-based clock, is a very high level of accuracy.

The alternative to this would be to use a VCO as the oscillator. However, given the potentially poorer phase noise performance of a VCO compared to a quartz oscillator, the oscillator itself would likely be more prone to phase noise. The PLL within the product may therefore need to be biased towards rejecting intrinsic jitter, which can be achieved by using a wider-bandwidth PLL. This means any interference or cable filtering effects will have a more direct impact on the sound which, in some use cases, may be undesirable.
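The bandwidth trade-off can be sketched with a textbook first-order loop model. This is a simplification for illustration only, not a description of any dCS PLL. In this model the loop behaves as a low-pass filter for jitter arriving on the interface and as a high-pass filter for the local oscillator's own phase noise. The function names and the example frequencies are assumptions.

```python
import math

def interface_jitter_gain(jitter_hz, loop_bw_hz):
    """Fraction of incoming interface jitter passed through the loop (low-pass)."""
    return 1 / math.sqrt(1 + (jitter_hz / loop_bw_hz) ** 2)

def oscillator_noise_gain(jitter_hz, loop_bw_hz):
    """Fraction of the local oscillator's phase noise left uncorrected (high-pass)."""
    ratio = jitter_hz / loop_bw_hz
    return ratio / math.sqrt(1 + ratio ** 2)

if __name__ == "__main__":
    jitter_hz = 1_000.0  # assumed jitter frequency
    for loop_bw_hz in (1.0, 10_000.0):  # assumed narrow and wide loop bandwidths
        print(
            loop_bw_hz,
            interface_jitter_gain(jitter_hz, loop_bw_hz),
            oscillator_noise_gain(jitter_hz, loop_bw_hz),
        )
```

In this model the narrow loop passes almost none of the 1 kHz interface jitter but leaves the oscillator's own noise at that frequency essentially untouched, which is acceptable only if the oscillator is quiet to begin with. The wide loop does the opposite: it cleans up a noisy oscillator but lets the interface jitter straight through.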

If this is the case, you might be wondering why any product would use a VCO as a clock source. One benefit of using a VCO over a quartz oscillator is the possibility for the clock to have a greater ‘pull range’, meaning it can lock to a wider range of signals (for example, signals running consistently too fast or too slowly).

In our experience, in the context of a high-end digital audio playback system, the quality of a DAC’s clocking should not be permanently compromised to allow for a sub-optimal source to be connected and work properly. At dCS, we opt for using a quartz-based oscillator with a high level of accuracy and stability, allowing for a pull range of +/- 300 parts per million (PPM), as per AES specification.

These graphs show an exaggerated example of the effect of jitter on the square wave output by a clock circuit. As previously discussed, jitter has impacted both the transition times of the wave and the peak voltages the wave can reach. This has the effect of changing the point at which the system would perceive, for argument’s sake, a 0 changing to a 1. Clock systems watch the ‘rising edge’ of the clock signal where the voltage increases, so the point on the rising edge where the amplitude goes above 0.5 has been marked on the graphs. The timing of this is regular in the first graph – the transition points on the rising edges fall on 2, 4 and 6 on the X axis respectively. When jitter is introduced, the transition points are shifted forwards or backwards depending on the nature of the jitter. The timing is no longer regular; it is random.
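The link between a voltage disturbance and a timing error can be sketched with a very simple edge model. The code below assumes an idealised rising edge that ramps linearly from 0 to 1, with a 0.5 threshold as in the graphs. The function name `crossing_time` and all the numbers are hypothetical.

```python
def crossing_time(edge_start, rise_time, voltage_offset=0.0, threshold=0.5):
    """Time at which a linear 0-to-1 rising edge crosses the threshold.

    voltage_offset models noise or interference added to the edge:
    a positive offset makes the edge reach the threshold early.
    """
    return edge_start + (threshold - voltage_offset) * rise_time

if __name__ == "__main__":
    # Assumed edges that should cross the threshold at t = 2.0.
    for rise_time in (0.2, 1.0):  # a fast edge and a slower, filtered edge
        start = 2.0 - 0.5 * rise_time
        ideal = crossing_time(start, rise_time)
        disturbed = crossing_time(start, rise_time, voltage_offset=0.1)
        print(rise_time, ideal, disturbed, disturbed - ideal)
```

The same 0.1 voltage disturbance moves the crossing point five times further on the slow edge than on the fast one. This is one way of seeing why a cable that rounds off the square wave makes the clock more vulnerable to interference.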

There are several factors that can cause phase noise (and, consequently, jitter) on an oscillator, and these should all be taken into account when designing a clock system.

Physical Vibrations

As the basis of a quartz clock is its piezoelectric property (the physical movement of the crystal when a voltage is applied), any external physical vibrations can cause inaccuracy in the clock. The extraneous movement does not need to be vigorous to cause inaccuracy. It could be as subtle as, for example, the vibrations of a CD mechanism inside the product. Any measures which can be taken to isolate the clock circuitry from the external physical vibrations should be taken, as this creates a higher level of clock performance.

Power Supply

The ability of a quartz crystal (or any piezoelectric material) to maintain a consistent oscillation frequency relies on having a stable, interference-free DC signal of the correct specification. In the case of both VCXOs and OCXOs, this means clean DC for the power supply. In the case of a VCXO, even more crucially, the control voltage needs to be stable (in this type of oscillator, the control voltage is used to make fine adjustments to the crystal’s frequency). If there is any variation in the power rails fed to the crystal, the frequency it resonates at will change. In products with a quartz-based clock, designers should always endeavour to have as clean a power supply fed to the crystal(s) as possible, at the correct voltage and frequency for the specification of the region the product is being used in.

Crosstalk

Electronic circuits can generate electromagnetic leakage. This is often seen when running high-rate signals such as digital audio signals through the copper tracks found on PCBs (printed circuit boards). The copper tracks essentially act as antennas, with the digital audio signals being radiated from the board. This interference can have an impact on associated clock circuits if they are in close proximity, negatively impacting the performance of the clock.

The correct way to eliminate this issue is to design the product’s PCBs in such a way that minimises crosstalk. Secondary to this is to ensure that any sensitive parts are separated from those which may cause interference. An even more effective method is to completely remove many potential sources of EM interference from the product, where possible, such as with the use of a Master Clock (a standalone clock with its own dedicated circuitry and power supplies).

Clock Frequency

The ideal way to design the clock inside an audio product is to have two oscillators: one running at a direct multiple of 44.1kHz, and the other running at a direct multiple of 48kHz. The reason for this is that almost all sample rates used in digital audio are multiples of these ‘base rates’ (including DSD, which runs at very high multiples of 44.1kHz). If the clock does not use direct multiples of the sample rate it is to be clocking, the maths becomes more complex and the electronics that need to be used to generate the correct frequency are more prone to jitter.

Trying to clock a 44.1kHz signal with a 10MHz clock, for example, would require somehow synthesising 44.1kHz from 10MHz, which mathematically is not clean, as 10MHz is not a whole-number multiple of 44.1kHz. As such, this type of clock will need to use methods such as asynchronous rate conversion to derive the required rate. These methods invariably result in a ‘dirtier’ frequency spectrum of the clock signal, meaning that the system will be more prone to jitter.

dCS products use two oscillators, running at 2^9 (512) times the base audio rates (44.1kHz and 48kHz), so 22.5792MHz and 24.576MHz. The easy division down to any required rate results in a cleaner clock spectrum and, as a result, less jitter.
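The arithmetic behind this can be checked in a few lines of Python. The sketch below only looks at the division ratios; it says nothing about how the dividers are implemented in hardware, and the helper names are invented for this example.

```python
from fractions import Fraction

MASTER_44K1_HZ = 22_579_200  # 512 x 44.1kHz
MASTER_48K_HZ = 24_576_000   # 512 x 48kHz

def division_ratio(master_hz, target_hz):
    """Exact ratio between a master clock and the rate to be derived from it."""
    return Fraction(master_hz, target_hz)

def divides_cleanly(master_hz, target_hz):
    """True when the target rate is reachable by whole-number division."""
    return division_ratio(master_hz, target_hz).denominator == 1

if __name__ == "__main__":
    # 2,822,400Hz is the DSD64 rate (64 x 44.1kHz).
    for target_hz in (44_100, 176_400, 2_822_400):
        print(target_hz, division_ratio(MASTER_44K1_HZ, target_hz))
    for target_hz in (48_000, 192_000):
        print(target_hz, division_ratio(MASTER_48K_HZ, target_hz))
    # A 10MHz reference gives a ratio that is not a whole number.
    print(44_100, division_ratio(10_000_000, 44_100))
```

Every rate in the 44.1kHz family divides out of 22.5792MHz as a whole number, and likewise for the 48kHz family and 24.576MHz. The 10MHz case reduces to 100000/441, which cannot be produced by simple division.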

Clock Temperature

While clock temperature is not a source of phase noise, it can affect the performance of an oscillator. The resonant frequency of a quartz crystal depends on its physical size and, by extension, on its temperature. As the temperature of the quartz increases, it physically expands. As the temperature decreases, it contracts. This causes changes in the resonant frequency of the quartz, as the physical size is now different. Temperature variations in digital systems should therefore be avoided, or the effects mitigated wherever possible.

There have been several methods for counteracting temperature variations inside a crystal oscillator. One approach is to use an industry-standard OCXO. An OCXO aims to remove the temperature variation of the crystal by using a Curie heating element to keep it at a stable temperature. The Curie device is a resistive heater whose resistance increases sharply when it reaches a certain temperature, effectively cutting back the heating power. The temperature will overshoot and then stabilise around the required temperature. When the temperature of the product is not stable (such as when it is powered on from cold), due to thermal delays, there will be some fluctuation in the temperature and, consequently, the frequency as the system ‘hunts’ around the target temperature. Once the crystal’s temperature has stabilised, however, the clock will output a stable frequency.

Another approach is to use a microcontroller-enhanced VCXO, as we’ve done in a number of dCS products. This approach does not use any heating elements to account for temperature variation. Instead, we utilise the large amounts of processing power available thanks to the FPGA-based design of our products to make constant adjustments to the control voltage fed to the VCXO to compensate for temperature changes.

In the case of a dCS Master Clock, such as the Rossini Clock or Vivaldi Clock, these adjustments are based on intensive measurements taken during production. During the production process, we place the clock (and the circuit board the clock is fixed to) into an Environmental Chamber. This chamber measures the clock frequency against the current controlled environmental temperature and records it onto the FPGA inside the product. The temperature is then changed, the clock measured, and the performance again logged. This process is repeated over 18 hours. This enables us to plot exactly how the VCXO in the Master Clock behaves at any given temperature, which the product has a record of.

This data is actioned in the product by adjusting the control voltage which is fed to the VCXO. A higher or lower voltage will create a higher or lower resonant frequency. This, combined with the product’s knowledge of its performance against temperature, ensures the clock’s output frequency is always stable. At any given normal operating temperature, the clock’s output frequency will be consistent.
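As a rough illustration of the idea, the sketch below stores a small calibration table and interpolates between its points to decide how much correction to apply at a given temperature. The table values are hypothetical, the correction is expressed in PPM rather than as a control voltage, and none of this is taken from the actual product firmware.

```python
from bisect import bisect_right

# Hypothetical calibration points: (temperature in deg C, frequency error in PPM).
CALIBRATION = [(10.0, 1.8), (20.0, 0.6), (30.0, 0.0), (40.0, 0.5), (50.0, 1.9)]

def predicted_error_ppm(temp_c, table=CALIBRATION):
    """Linearly interpolate the logged frequency error at temp_c.

    Temperatures outside the table are clamped to the nearest logged point.
    """
    temps = [t for t, _ in table]
    if temp_c <= temps[0]:
        return table[0][1]
    if temp_c >= temps[-1]:
        return table[-1][1]
    i = bisect_right(temps, temp_c)
    (t0, e0), (t1, e1) = table[i - 1], table[i]
    return e0 + (e1 - e0) * (temp_c - t0) / (t1 - t0)

def correction_ppm(temp_c, table=CALIBRATION):
    """Correction to apply: equal and opposite to the predicted error."""
    return -predicted_error_ppm(temp_c, table)

if __name__ == "__main__":
    for temp_c in (18.0, 25.0, 33.5):
        print(temp_c, predicted_error_ppm(temp_c), correction_ppm(temp_c))
```

In a real VCXO the correction would be translated into a change of control voltage, and the calibration data would be far denser than five points. The structure is the same, though: measure once in a controlled environment, then look up and compensate continuously in use.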

This is a constant process inside the Rossini and Vivaldi Clocks, with the clock temperature regularly measured, and the control voltage adjusted if the clock temperature has varied. The result is that a new Vivaldi Clock, for example, can achieve an accuracy of better than +/- 1 PPM when shipped. Once the clock has stabilised in its environment, the accuracy typically improves to +/- 0.1 PPM.
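For a sense of scale, parts per million can be converted into hertz with a one-line calculation. The helper names below are made up for this example. The figures fed into them are the ones quoted in this post, applied to the 22.5792MHz oscillator and to a 44.1kHz sample rate.

```python
def ppm_to_hz(nominal_hz, ppm):
    """Frequency offset in Hz corresponding to an error of `ppm` parts per million."""
    return nominal_hz * ppm / 1_000_000

def frequency_window(nominal_hz, ppm):
    """Lowest and highest frequency within +/- ppm of the nominal value."""
    offset = ppm_to_hz(nominal_hz, ppm)
    return nominal_hz - offset, nominal_hz + offset

if __name__ == "__main__":
    print(ppm_to_hz(22_579_200, 1))      # accuracy when shipped
    print(ppm_to_hz(22_579_200, 0.1))    # accuracy once stabilised
    print(frequency_window(44_100, 300)) # +/- 300 PPM pull range
```

At 22.5792MHz, 1 PPM is an offset of about 22.6Hz and 0.1 PPM is about 2.3Hz. The +/- 300 PPM pull range mentioned earlier corresponds to roughly 13Hz either side of 44.1kHz.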

In the next post, we’ll look at the other main kind of jitter: interface jitter.

Part 3 - Jitter (Interface)
