Compiler fence + DMA

⚓ Rust    📅 2025-07-18    👤 surdeus    👁️ 3      

Info

This post is auto-generated from RSS feed The Rust Programming Language Forum - Latest topics. Source: Compiler fence + DMA

I'm trying to follow the discussion in https://github.com/rust-lang/unsafe-code-guidelines/issues/321 from the perspective of a Rust user, not a compiler contributor. To be honest, it seems to be well over my head.

More specifically, I gave a concrete example in a comment on that issue. I understand that the issue is not the right place to ask for clarification when one lacks the compiler background to follow the discussion properly, so I hope for some help from this forum.

Repeating here, for convenience, the example I gave:

a) allocate some largish zero-copy buffer - possibly on the stack or statically (array of bytes, say a buffer that can hold a full IP packet including driver headroom/tailroom).
b) write to buffer across application code, network layers and libraries (e.g. application, rtos, smoltcp, soc-specific driver).
c) cast a pointer to the first byte of the buffer to u32 and save it to an MMIO register (volatile write)
d) wait for DMA to finish (interrupt)
e) deallocate or re-use buffer
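
To make step c) concrete, here is a minimal sketch of handing a buffer address to a DMA engine. The names `dma_address` and `program_dma` are hypothetical, and `reg` is an ordinary `u32` standing in for a real memory-mapped register so that the code also runs on a desktop host. It only illustrates the pointer-to-u32 conversion and the volatile write, not any synchronization.

```rust
use std::ptr; // `core::ptr` in no_std code

/// Convert a buffer pointer into the 32-bit value a DMA address register
/// would expect. Returns `None` if the address does not fit in 32 bits
/// (which can happen when trying this out on a 64-bit host).
pub fn dma_address(buf: *const u8) -> Option<u32> {
    let addr = buf as usize;
    if addr <= u32::MAX as usize {
        Some(addr as u32)
    } else {
        None
    }
}

/// Step c): write the buffer address to the (stand-in) DMA register.
/// Returns `false` without writing if the address does not fit in 32 bits.
///
/// # Safety
/// `reg` must be valid for a volatile 32-bit write.
pub unsafe fn program_dma(reg: *mut u32, buf: *const u8) -> bool {
    match dma_address(buf) {
        Some(addr) => {
            // Volatile: the compiler may not elide or duplicate this store.
            unsafe { ptr::write_volatile(reg, addr) };
            true
        }
        None => false,
    }
}
```

On a 32-bit microcontroller the conversion always succeeds; the check only matters when the same code is exercised on a 64-bit machine.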

And the inverse for an inbound packet:

a) allocate buffer
c) save pointer as u32 to MMIO register (volatile write)
d) wait for DMA to finish
b) parse buffer across application and network layers
e) de-allocate buffer

The typical solution for this in the embedded ecosystem seems to be to place compiler fences or hardware memory fences around the DMA access, to synchronize it with prior and subsequent memory accesses to the DMA buffer.
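
For reference, the fence pattern described above can be sketched roughly as follows. This is the pattern whose validity is in question, not a recommended solution. The function names, the `done` flag (imagined to be set by an interrupt handler) and the plain `u32` standing in for the MMIO register are all assumptions made for illustration. A hardware `fence` would sit in the same places as the `compiler_fence` calls.

```rust
use std::ptr; // `core::ptr` in no_std code
use std::sync::atomic::{compiler_fence, AtomicBool, Ordering};

/// Outbound path, steps b) and c). Panics if `payload` is longer than `buf`.
///
/// # Safety
/// `reg` must be valid for a volatile 32-bit write.
pub unsafe fn start_tx(buf: &mut [u8], payload: &[u8], reg: *mut u32) {
    // b) ordinary, non-volatile writes fill the buffer.
    buf[..payload.len()].copy_from_slice(payload);

    // Intended to keep the compiler from moving the buffer writes above
    // past the register write below.
    compiler_fence(Ordering::Release);

    // c) hand the address to the (stand-in) DMA register.
    // The cast truncates on a 64-bit host; real targets here are 32-bit.
    unsafe { ptr::write_volatile(reg, buf.as_ptr() as usize as u32) };
}

/// Inbound path, steps d) and b). Returns the first received byte once the
/// (hypothetical) completion flag has been set, `None` before that or if
/// `len` is zero.
///
/// # Safety
/// `buf` must be valid for reads of `len` bytes, and the DMA engine must no
/// longer write to the buffer once `done` is set.
pub unsafe fn finish_rx(done: &AtomicBool, buf: *const u8, len: usize) -> Option<u8> {
    // d) completion flag, imagined to be set by the interrupt handler.
    if !done.load(Ordering::Relaxed) {
        return None;
    }

    // Intended to keep the compiler from moving the buffer reads below
    // ahead of the completion check above.
    compiler_fence(Ordering::Acquire);

    // b) "parse" the buffer; the slice is created only after the fence.
    let data = unsafe { std::slice::from_raw_parts(buf, len) };
    data.first().copied()
}
```

The sketch deliberately creates the receive slice only after the completion check, so that no Rust reference to the buffer exists while the device may still be writing to it.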

However, the issue I'm pointing to seems to imply that this is not the proper way of synchronizing access, at least not after the change proposed in that issue. The participants seem to propose a different approach involving macros, volatile accesses, assembly with proper clobbers, fences, etc.
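
One possible reading of "assembly with proper clobbers" in Rust terms is an inline `asm!` block that is given the buffer pointer and is not marked `nomem`. Without `nomem`, the compiler has to assume the block may read and write memory, so it should not cache or reorder accesses to the buffer across it. The helper names below are hypothetical, and whether this is what the issue participants have in mind, or sufficient for a given DMA setup, is an open question here. It is only a sketch of the general shape.

```rust
use std::arch::asm; // `core::arch::asm` in no_std code

/// Hypothetical helper: an asm block that emits no instructions (only an
/// assembler comment) but receives the buffer pointer as an input operand.
/// Because the block is not marked `nomem`, the compiler must assume it may
/// read or write memory, including the memory behind `buf`.
pub fn dma_barrier(buf: *mut u8) {
    unsafe {
        asm!("/* {0} */", in(reg) buf, options(nostack, preserves_flags));
    }
}

/// Fill the buffer with ordinary writes, then pass it through the barrier
/// before it would be handed to a device. Returns the raw buffer pointer.
pub fn fill_then_barrier(buf: &mut [u8], value: u8) -> *mut u8 {
    for byte in buf.iter_mut() {
        *byte = value;
    }
    let p = buf.as_mut_ptr();
    dma_barrier(p);
    p
}
```

A real driver would follow the barrier with the volatile register write that starts the transfer; on hardware with caches or write buffers, an actual barrier instruction may be needed in the asm template as well.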

But those solutions are proposed in an abbreviated form that assumes compiler and/or language knowledge I don't have, so they are not accessible to me. Can anyone try to explain this in language that assumes less prior knowledge? I'd love to transform their comments into real code in the context of drivers similar to the ones linked above as examples.

Note: I am aware of and comfortable with the C++20 memory model (Acquire, Release, ...), and I brought up the issue in the first place because the current use of fences in the cited sources seems to be invalid. But I'm not familiar with the finer semantic details of bringing assembly with specifically crafted "clobbers" and macros into the picture.

40 posts - 10 participants

🏷️ rust_feed