How to correctly use asm! as a memory barrier (LLVM question)

⚓ Rust    📅 2025-07-21    👤 surdeus    👁️ 2      

This is in some respects a follow-up to "Compiler fence + DMA", focused on the actual inner workings of asm!(). To strengthen my intuition about asm! as a memory barrier, I experimented with it a bit longer.

It seems that the analogy to atomic fences from the cited topic cannot fully explain the behavior of the following sample program - see the inline comments:

#![no_std]
use core::arch::asm;

#[unsafe(no_mangle)]
pub unsafe fn do_dma(dma_ptr: *mut *mut u32, dma_start_cmd: *mut bool, dma_done_ptr: *const bool) {
    let mut buf: [u8;3] = [0;3];
    buf[0] = 42; // this is eliminated unless one of the fixes is implemented

    pass_buf_to_hardware(dma_ptr, dma_start_cmd, &mut buf);
    wait_for_hardware(dma_done_ptr);
    
    // lifetime of buffer ends here in LLVM IR - no matter the position and variant of asm! statements.
}

#[unsafe(no_mangle)]
unsafe fn pass_buf_to_hardware(dma_ptr: *mut *mut u32, dma_start_cmd: *mut bool, buf_ptr: *mut [u8]) {
    asm!(""); // doesn't work: the array initialization is still eliminated despite the asm's ~{memory} clobber
    // asm!("/* {} */", in(reg) buf_ptr.cast::<*mut u32>()); // potential fix #1
    dma_ptr.write_volatile(buf_ptr.cast()); // Passes a ptr to an uninit array unless one of the fixes is implemented.
    // asm!(""); // potential fix #2
    dma_start_cmd.write_volatile(true);
}

#[unsafe(no_mangle)]
#[inline(never)]
unsafe fn wait_for_hardware(dma_done_ptr: *const bool) {
    while !dma_done_ptr.read_volatile() {}
    // asm!(""); // potential fix #3
}

In an attempt to understand what's going on, I looked at the LLVM IR. Result: every empty asm!("") lowers to the same call, asm sideeffect alignstack inteldialect "", "~{dirflag},~{fpsr},~{flags},~{memory}"(), no matter where I place it.

Here are my specific questions:

  • Why does LLVM assume that the contents of the stack-allocated buffer will not be accessed, despite an asm statement with a ~{memory} clobber placed after the initialization? According to the LLVM docs, ~{memory} should force LLVM to assume that any memory could be accessed inside the asm! block. That is more than just a barrier against reordering. Shouldn't it force LLVM to assume that the stack-allocated, initialized buffer could also be read?
  • Fix #1: Why does an additional in-parameter to the asm! statement in the exact same location, just with the pointer to (but not the content of!) the buffer make a difference?
  • Fix #2: Why does a second empty asm!("") statement make a difference when it is placed anywhere after the volatile write that escapes the pointer to (not the content of) the buffer? Shouldn't the ~{memory} clobber behave the same wherever it is placed, as long as it comes after the initialization of the buffer?
  • Fix #3: How can an asm! statement in a different function spookily influence the optimization of the calling function without any obvious change to the generated IR of the called function? Note: there is no inlining; I confirmed this in both the IR and the assembly.
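
For reference, here is fix #1 reduced to a sketch that builds on a hosted target. It assumes x86-64 or AArch64, where in(reg) accepts a thin pointer. The names expose_to_asm and init_and_expose are hypothetical and not part of the sample above. One possible reading, which I have not confirmed, is that the memory clobber only covers memory the asm block could plausibly reach, and that a local whose address never escaped is not considered reachable until its pointer is passed in as an operand.

```rust
use core::arch::asm;

/// Hypothetical helper (not from the sample above): hands `ptr` to an
/// otherwise empty asm block, mirroring "potential fix #1".
#[inline(always)]
fn expose_to_asm(ptr: *mut u8) {
    // SAFETY: the template is only an assembler comment, so no instruction
    // is emitted and nothing is actually read or written.
    unsafe {
        asm!("/* {0} */", in(reg) ptr);
    }
}

/// Initializes a stack buffer and exposes its address before returning it.
fn init_and_expose() -> [u8; 3] {
    let mut buf = [0u8; 3];
    buf[0] = 42;
    // Assumption: once the address is an asm operand, the optimizer has to
    // treat the asm block as a possible reader or writer of `buf`.
    expose_to_asm(buf.as_mut_ptr());
    buf
}
```

Calling init_and_expose() returns the initialized array either way, so this only demonstrates the shape of the fix. Whether the store to buf[0] survives in a DMA-style setup still has to be checked in the emitted IR or assembly.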

Obviously my understanding is still incomplete. Can anyone help me understand this?
