Rust does not eagerly free stack space

⚓ Rust    📅 2025-12-08    👤 surdeus    👁️ 10      

Hello, I am writing embedded code, and one problem keeps eating hours of my time.

Imagine a function needs to make a (relatively) big allocation on the stack. It allocates the value, uses it, and then no longer needs it. The value has no Drop implementation and is scoped in { } inside the function, so there is no need to keep it around until the end of the function.

But that is exactly what Rust does. I inspect the assembly and see that Rust always generates a single sp bump at the start of the function, sized for the worst case, and never gives anything back before the function returns. The problems begin when I call further functions: they run on a stack with less space and fail. I constantly have to go around and move chunks of code into inline(never) functions just to salvage the situation a little. Is this by design? This is horrible. I would even accept Rust generating a memmove to defragment the stack if that is what it takes to reduce stack usage.
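The workaround described above can be wrapped in a small helper so that a block does not have to be hand-extracted into a named function each time. This is only a sketch: `in_own_frame` and `checksum_large_buffer` are hypothetical names, and `#[inline(never)]` is a hint to the compiler rather than a guarantee, so the generated assembly should still be checked.

```rust
use core::hint::black_box;

/// Hypothetical helper: runs `f` inside a separate, non-inlined function.
/// The closure body is compiled as part of (or called from) this function,
/// so its locals are typically reserved on entry here and released on return,
/// instead of being folded into the caller's frame.
#[inline(never)]
fn in_own_frame<R>(f: impl FnOnce() -> R) -> R {
    f()
}

/// Example caller: the 64 KiB buffer exists only inside the closure.
fn checksum_large_buffer() -> u32 {
    in_own_frame(|| {
        let mut buf = [0u8; 64 * 1024];
        // Stop the optimizer from removing the buffer altogether.
        black_box(&mut buf);
        buf.iter().map(|&b| u32::from(b)).sum()
    })
}

fn main() {
    let sum = checksum_large_buffer();
    // Anything called from here runs after the helper's frame has been popped.
    println!("sum = {sum}");
}
```

The closure may borrow locals from the enclosing function, which keeps the call site close to an ordinary `{ }` block. The cost is one extra call per use.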

Maybe I am missing something? Some compilation flag that will magically solve all problems?

Here is a simple test, on std. It would be great if blocks behaved like inline(never) functions by default. That would save tons of RAM and time.

fn main() {
    let mut pointers = [0usize; 100];
    test1_1(&mut pointers);

    println!("Test 1 Results:");
    print_abs_and_diff(&mut pointers[0..4]);

    test2_1(&mut pointers);
    println!("\nTest 2 Results:");
    print_abs_and_diff(&mut pointers[0..4]);
}

fn print_abs_and_diff(pointers: &mut [usize]) {
    println!("Absolute stack pointers:");
    for (i, &ptr) in pointers.iter().enumerate() {
        println!("Depth {}: {:#x}", i, ptr);
    }
    println!("\nDifferences between consecutive stack pointers:");
    for i in 1..pointers.len() {
        let diff = pointers[i - 1] as isize - pointers[i] as isize;
        println!("Depth {} to {}: {}", i - 1, i, diff);
    }
}

///////////////////////////////////

#[inline(never)]
fn test1_1(p: &mut [usize]) {
    p[0] = get_stack_pointer();
    test1_2(p);
}

#[inline(never)]
fn test1_2(p: &mut [usize]) {
    {
        p[1] = get_stack_pointer();
        let mut big = [0u8; 3 * 1000 * 1000];
        core::hint::black_box(&mut big);
    }
    p[2] = get_stack_pointer();
    test1_3(p);
}

#[inline(never)]
fn test1_3(p: &mut [usize]) {
    p[3] = get_stack_pointer();
}

///////////////////////////////

#[inline(never)]
fn test2_1(p: &mut [usize]) {
    p[0] = get_stack_pointer();
    test2_2(p);
}

#[inline(never)]
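fn test2_2(p: &mut [usize]) {
    // Same work as the block in test1_2, but moved into its own
    // non-inlined function so the buffer gets a separate frame.
    #[inline(never)]
    fn big(p: &mut [usize]) {
        p[1] = get_stack_pointer();
        let mut big = [0u8; 3 * 1000 * 1000];
        core::hint::black_box(&mut big);
    }
    // Without this call, `big` never runs and p[1] keeps its value from test 1.
    big(p);
    p[2] = get_stack_pointer();
    test2_3(p);
}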

#[inline(never)]
fn test2_3(p: &mut [usize]) {
    p[3] = get_stack_pointer();
}

///////////////////////////////////

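// Reads the current stack pointer. x86_64 only: the asm names `rsp` directly.
// Inlined so that it reports the caller's stack pointer, not its own.
#[inline(always)]
fn get_stack_pointer() -> usize {
    let sp: usize;
    unsafe {
        std::arch::asm!("mov {}, rsp", out(reg) sp);
    }
    sp
}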

(The build profile uses opt-level = "z", i.e. Oz.)
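The test above reads `rsp`, so it only builds for x86_64. On other targets, a rough stand-in is to take the address of a local variable. The sketch below is an approximation under stated assumptions: `approx_stack_address`, `inner`, and `outer` are hypothetical names, and the returned address lies somewhere inside the current frame rather than at the exact stack pointer, so it is only useful for comparing different frames, not positions within one function.

```rust
use core::hint::black_box;

/// Hypothetical stand-in for `get_stack_pointer`: returns the address of a
/// local that lives in the caller's frame (the function is always inlined).
#[inline(always)]
fn approx_stack_address() -> usize {
    let marker = 0u8;
    // black_box forces `marker` to actually occupy a stack slot.
    black_box(&marker) as *const u8 as usize
}

#[inline(never)]
fn inner() -> usize {
    approx_stack_address()
}

#[inline(never)]
fn outer() -> (usize, usize) {
    let here = approx_stack_address();
    let mut big = [0u8; 16 * 1024];
    black_box(&mut big);
    let below = inner();
    // Touch the buffer again so it stays live across the call to `inner`.
    black_box(&mut big);
    (here, below)
}

fn main() {
    let (here, below) = outer();
    println!(
        "outer: {here:#x}, inner: {below:#x}, distance: {}",
        here.abs_diff(below)
    );
}
```

The printed distance depends on the target, the optimization level, and how the compiler lays out each frame, so it should be read as an estimate.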

By the way, in debug builds each match arm may get its own allocation on the stack instead of all arms sharing one sized for the worst case 😅
