Diretta 3-tier setup with upsampler

stevebythebay · November 17, 2025, 5:23pm

I’m planning on testing a configuration of Roon which David Snyder (https://github.com/dsnyder0pc/rpi-for-roon/blob/main/Diretta.md) has put together. It involves feeding the output from a Roon Server over the network to a pair of Raspberry Pi units running AudioLinux and the Diretta protocol, with Roon Bridge on the Target RPi. It’s a Host→Target chain that ultimately connects to the Upsampler’s USB 1 port. I’m just wondering if anyone else has explored this approach. The objective of this alternative is to reduce all manner of noise and smooth the delivery of the bits into the DAC.

Omni · November 17, 2025, 9:36pm

I’ve followed the Diretta thread on pink fish. Mr. Snyder seems to be a little “cultish” to me. What he’s espousing seems trivial at best, nonsense at worst. No, I won’t be trying it. I really don’t need yet more boxes in the audio chain.

Anupc · November 18, 2025, 2:07am

Professionally, I deal with networking hardware/software design calls and IETF RFCs on a practically daily basis. Having looked at Diretta when it was first introduced by SoulNote, it was pretty clear that the basic premise of why Diretta is even needed was so fundamentally flawed that it literally borders on conspiracy theory.

And like all conspiracy theories, it’s built on a foundation of wildly misinterpreted facts. So when people read the background, because there are some facts in there, the premise appears plausible.

Without going into all the technical details of why Diretta is really not worth anyone’s time, the following alone should be a red flag:

The second Raspberry Pi, the Diretta Target, connects only to the Host Pi via a short Ethernet cable, creating a point-to-point, galvanically isolated link. It receives the audio from the Host and connects to your DAC or DDC via USB.

Connecting a Raspberry Pi’s USB port to your dCS DAC is a recipe for (sonic) mediocrity. Any supposed benefit from Diretta’s lower noise from non-bursty packet processing will be completely lost.

stevebythebay · November 18, 2025, 2:25am

Well, I’ll let you know if I find it to positively impact my listening experience vis-à-vis my current Grimm MU1 to dCS Upsampler / DAC via AES/EBU. I’ll be surprised if my current implementation is not better.

I did recently update the MU1 with v2 firmware that supports DLNA, and have found using UPnP via NAS minimserver a tad better. Only using the free version of mConnect. Not yet playing with JPLAY on the iPad. Makes me wonder if the issue is a Roon thing (RAAT) or something else.

This hobby is always an interesting experience.

ngxant · December 13, 2025, 2:52am

Referring to Diretta as an “audio transmission protocol” involves several conceptual errors and ambiguities when examined against established standards in networking and digital audio engineering. In communications engineering, a protocol must have a clearly defined specification covering its operational layer, packet structure, synchronization mechanisms, error detection and control, flow control, and—critically—interoperability between independent implementations. Diretta does not publish any such standardized specification: there is no RFC, no platform-independent packet definition, and no provision for third-party client or server implementations. From a technical standpoint, Diretta is therefore not a network protocol but a proprietary audio transport mechanism implemented on top of existing TCP/IP, tightly coupled to the Windows operating system and dependent on kernel- and driver-level behavior.

A second major misconception lies in conflating “network audio transmission” with “audio processing optimization within a host system.” Descriptions commonly associated with Diretta, such as “direct kernel streaming” or “bypassing the OS audio stack,” primarily describe reductions in intermediate software layers, fewer context switches, and adjusted scheduling priorities for audio threads. These measures may affect how audio data are handled inside the operating system, but they do not alter the fundamental transmission characteristics of TCP/IP. Any potential benefit therefore arises from operating system–level optimization rather than from a change in network transport principles.

Another significant source of ambiguity concerns the use of the term jitter. In digital audio engineering, it is essential to distinguish between network jitter (packet timing variation), software-induced jitter (due to scheduling and buffering), and clock jitter at the DAC. In IP-based audio transport, packet jitter is absorbed by buffers and does not directly define the DAC clock; TCP/IP does not convey a physical clock. Without explicit clock distribution or time-synchronization mechanisms such as PTP, SyncE, or word clock, there is no technical basis for claiming control over DAC-level jitter. Collapsing these distinct forms of jitter into a single concept and attributing “jitter reduction” to Diretta at the transmission level is therefore ambiguous and lacks verifiable technical grounding.
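The buffering point above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the packet size, delay range, and prebuffer depth are made-up numbers, and `simulate_playout` is a hypothetical helper, not anything taken from Diretta, Roon, or a real network stack. It models packets arriving with random delay while playout is scheduled purely from a local clock.

```python
import random

def simulate_playout(n_packets=1000, packet_ms=5.0, jitter_ms=3.0,
                     prebuffer_ms=20.0, seed=0):
    """Toy model of a receive buffer (all parameters are assumed values).

    Packets are sent every packet_ms and arrive after a random extra delay
    of up to jitter_ms. Playout is scheduled from the local clock only,
    starting prebuffer_ms after the first nominal send time.
    """
    rng = random.Random(seed)
    # Arrival times: nominal send time plus random network delay.
    arrivals = [i * packet_ms + rng.uniform(0.0, jitter_ms)
                for i in range(n_packets)]
    # Playout times: derived from the local clock, never from arrival times.
    playouts = [prebuffer_ms + i * packet_ms for i in range(n_packets)]
    # An underrun would occur if a packet arrived after its playout slot.
    underruns = sum(1 for a, p in zip(arrivals, playouts) if a > p)
    arrival_gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
    playout_gaps = [b - a for a, b in zip(playouts, playouts[1:])]
    return arrival_gaps, playout_gaps, underruns

if __name__ == "__main__":
    arrival_gaps, playout_gaps, underruns = simulate_playout()
    print("arrival gap spread (ms):", max(arrival_gaps) - min(arrival_gaps))
    print("playout gap spread (ms):", max(playout_gaps) - min(playout_gaps))
    print("underruns:", underruns)
```

In this model the arrival gaps vary while the playout gaps stay constant, as long as the prebuffer exceeds the worst-case delay. It says nothing about the DAC's own clock jitter, which is a separate quantity the simulation does not attempt to represent.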

The emphasis on kernel-level operation also leads to a common conceptual error: equating kernel execution with high-precision real-time behavior. The Windows kernel is not a real-time operating system and does not provide hard real-time guarantees. Kernel-level execution may reduce average latency or scheduling variability, but it does not ensure deterministic timing or strict deadlines, nor does it provide the time alignment achievable in professional networked audio systems based on AES67 or Dante with PTP synchronization.

A further issue arises from the lack of independent interoperability. Diretta operates only with licensed software and supported devices and does not allow independent third-party implementations. This directly contradicts the fundamental definition of a protocol, which implies a shared, implementable specification enabling communication between independently developed systems. Consequently, positioning Diretta alongside standardized audio networking protocols such as AES67, Dante, or Ravenna represents a category error. Those technologies are designed for full-scale audio networks with defined clock synchronization, scalability, and multi-vendor interoperability, whereas Diretta is a proprietary PC-to-DAC solution focused on software-level optimization.

In summary, the core conceptual mistake is labeling Diretta as an “audio transmission protocol.” In reality, it is a proprietary audio transport and processing mechanism implemented at the application and kernel levels, leveraging standard TCP/IP without defining any new network transmission standard. Claims regarding jitter control, time synchronization, or superiority over professional audio networking protocols are either ambiguous or unsupported by independently verifiable technical evidence.

Anupc · December 13, 2025, 3:34am

Totally agree.

In fact, there’s a thread on the Roon community forum that objectively debunked any benefit from Diretta.

stevebythebay · December 13, 2025, 4:54am

My experience with the Diretta hardware/software led me to find that my current system performed well beyond what Diretta could provide. However, that’s but one system among so many other possible configurations. Our mind/body complexities suggest there are few, if any, testing scenarios that adequately explain why so many of us “hear” what others do, or do not.

Shakespeare’s Hamlet says:

“There are more things in heaven and earth, Horatio, / Than are dreamt of in your philosophy.” (Act 1, Scene 5)

Pretty well sums up my quandary. As an aside, I have begun to experiment with an updated version of my Grimm MU1 streamer’s firmware, which offers support for DLNA/UPnP. Running minimServer on my NAS with this new support has let me try this approach in place of Roon playback of my local library. Using the free mconnect Player Lite, I find it a sonically better alternative to Roon. I will test JPLAY in the future to determine if it offers any sonic difference against both Roon and mconnect.

Erwan · December 13, 2025, 9:41am

Well, you seem to be way down the rabbit hole already.
Given you already have a Vivaldi Stack, there’s almost no room for improvement on the digital side, I’m afraid (other than listening to a Varèse or to dCS competitors)…
If you want to experience significantly new sound, better to listen to records that are new to you.
If you want to significantly improve (or sometimes just alter) sound quality, better to change the position of your loudspeakers, add room treatment, or look for new loudspeakers…
Or try new cables (loudspeaker, interconnect, power…) if and only if you’re into cables. The new Japanese brand Turnbull Audio comes with growing hype nowadays, despite its ludicrous pricing (up to 50k for speaker cables, 20k for power, 10k for interconnect!!!); there should be a good reason for this(?)…
Anyway, good luck with your journey…

glevethan · December 13, 2025, 7:45pm

Compared to Siltech Master Crown these are……reasonably priced

Anupc · December 13, 2025, 10:51pm

Precisely.

Objectively, the Vivaldi Upsampler is source bit-perfect from its AES output with any UPnP Server and Mosaic. Which means it’s literally impossible to improve the digital input stage.

Changing the transport software/protocol (like with Diretta), or changing the Control-Point software (like with JPlay), has no way to improve what’s already bit-perfect.

That said, applying digital signal processing to the source stream, as with the Grimm or Roon with DSP engaged, can certainly change the sound, perhaps to the listener’s preference, but the result is no longer bit-perfect.
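As a rough illustration of what “bit-perfect” means here, the following sketch compares digests of sample data. Everything in it is an assumption for illustration: the sample values are invented, the two “transports” are simply copies of the source, and `pcm_digest` and `apply_gain` are hypothetical helpers, not part of any dCS, Roon, or Diretta tooling.

```python
import hashlib
import struct

def pcm_digest(samples):
    """SHA-256 over the samples packed as 16-bit little-endian PCM."""
    payload = struct.pack(f"<{len(samples)}h", *samples)
    return hashlib.sha256(payload).hexdigest()

def apply_gain(samples, gain):
    """Toy DSP stage: scale each sample and clamp to the 16-bit range."""
    return [max(-32768, min(32767, round(s * gain))) for s in samples]

# Invented source samples standing in for a decoded track.
source = [0, 1200, -1200, 32767, -32768, 5]

# Two hypothetical transports that each deliver the samples unchanged.
via_transport_a = list(source)
via_transport_b = list(source)

# The same stream after a DSP stage (here a simple 0.5x gain).
with_dsp = apply_gain(source, 0.5)

if __name__ == "__main__":
    print("A matches source:", pcm_digest(via_transport_a) == pcm_digest(source))
    print("B matches source:", pcm_digest(via_transport_b) == pcm_digest(source))
    print("DSP matches source:", pcm_digest(with_dsp) == pcm_digest(source))
```

If two delivery paths produce identical digests, the sample data they delivered is identical, so any difference between them would have to come from something other than the bits. The DSP path produces a different digest because it deliberately changes the samples.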

I definitely agree, there are plenty of other parts of the audio chain to focus on to improve SQ rather than fiddling with the digital stream input stage to dCS kit.