Any way to force the *omission* of frame pointers?
⚓ Rust 📅 2025-10-30 👤 surdeus 👁️ 7
I'm using Rust to write code for an embedded target of mine: a CPU built in a digital circuit simulator that implements a subset of Thumb (specifically, thumbv6m-none-eabi).
The CPU is slow (~40kHz), so every wasted instruction adds up to a noticeable slowdown in whatever I run. I understand this is nowhere near anything Rust was designed to run on.
There is one thing I'm stuck on: every function, even a leaf, sets up a frame pointer, which I don't need.
For example, here's a small function that implements division using a hardware divider mapped to memory:
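#[unsafe(export_name = "__aeabi_uidiv")]
pub extern "C" fn __aeabi_uidiv(a: u32, b: u32) -> u32 {
    let res: u32;
    unsafe {
        // Load the result from the memory-mapped hardware divider.
        core::arch::asm!("ldr {res}, [{addr}]",
            // Explicit u32: the literal does not fit in the default i32.
            addr = in(reg) 0xffff_ff20u32,
            res = lateout(reg) res,
            // The ABI already places a and b in r0 and r1.
            in("r0") a,
            in("r1") b,
        )
    }
    res
}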
Since the ABI passes the parameters in consecutive registers (r0, r1) and returns the result in r0, I would expect the following code:
__aeabi_uidiv:
    movs r2, #223
    mvns r2, r2     @ simply getting the address of the MMIO port
    ldr r0, [r2]    @ r0 and r1 are already populated with the parameters
    bx lr
This is the smallest possible Thumb code for what I'm trying to do.
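As a quick sanity check on the expected listing, the `movs`/`mvns` pair can be modeled in plain Rust to confirm it produces the MMIO address used in the source. This is only a minimal sketch: the helper name `movs_mvns` is made up for illustration and does not appear in the original code.

```rust
/// Models `movs rX, #imm8` followed by `mvns rX, rX`:
/// load an 8-bit immediate, then take its bitwise NOT.
/// (Hypothetical helper, for illustration only.)
pub fn movs_mvns(imm8: u8) -> u32 {
    !(imm8 as u32)
}
```

With the immediate from the listing, `movs_mvns(223)` evaluates to `0xffff_ff20`, which matches the address passed to `asm!` in the Rust function.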
But here is what I'm getting with opt-level=3, lto="fat", --release:
__aeabi_uidiv:
    .fnstart
    .save {r7, lr}
    push {r7, lr}
    .setfp r7, sp
    add r7, sp, #0
    movs r2, #223
    mvns r2, r2
    @APP
    ldr r0, [r2]
    @NO_APP
    pop {r7, pc}
There is no reason to save lr here, since the function is a leaf. More importantly, it uses r7 as a frame pointer, which I would not expect at opt-level=3. That's 2 additional instructions (and since push/pop take multiple cycles, it's effectively more like 6). In relative terms, 50% to 150% more instructions!
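The percentages follow directly from the two listings. Here is a small sketch of the arithmetic, assuming the instruction counts read off the listings (4 expected, 6 generated) and taking the "more like 6" figure as the post's own estimate rather than a measured cycle count. The function and constant names are hypothetical.

```rust
/// Extra cost as a percentage of the baseline (integer arithmetic).
pub fn overhead_percent(baseline: u32, extra: u32) -> u32 {
    extra * 100 / baseline
}

/// Expected listing: movs, mvns, ldr, bx.
pub const EXPECTED_INSTRUCTIONS: u32 = 4;
/// Generated listing: push, add, movs, mvns, ldr, pop.
pub const ACTUAL_INSTRUCTIONS: u32 = 6;
/// The post's estimate of the effective extra cost once multi-cycle
/// push/pop are accounted for (an assumption, not a measurement).
pub const ESTIMATED_EXTRA_COST: u32 = 6;
```

Here `overhead_percent(EXPECTED_INSTRUCTIONS, ACTUAL_INSTRUCTIONS - EXPECTED_INSTRUCTIONS)` gives 50, and `overhead_percent(EXPECTED_INSTRUCTIONS, ESTIMATED_EXTRA_COST)` gives 150.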
There is a codegen parameter to force frame pointers (-C force-frame-pointers), but nothing to force their omission, to my knowledge.
Is there anything I can pass to the compiler to make it generate the optimal code? Otherwise, I'll have to rewrite all those small functions in assembly. For the example I gave here, that's not really an issue, but I have more complex functions, such as memcpy4, that end up completely botched if I write them in Rust. Of course, this is the kind of thing global_asm! is for, but it'd be really nice to write this in Rust.
Thanks!
Edit: changed the example code to use a constant to make the point clearer.