How are enum discriminants sized in Rust?

⚓ rust    📅 2025-06-18    👤 surdeus    👁️ 2      


I'm seeing an interesting discrepancy in code generation/optimization between 32-bit and 64-bit targets, specifically in how the Rust compiler treats the size of an enum discriminant. Take an Option enum, whose tag can in theory fit in a single byte. Consider the following code (compiled with --release):

use std::sync::Mutex;

const SIZE: usize = 256;

#[repr(C, align(8))]
struct Foobar {
    d: [u8; SIZE],
    a: u64,
    b: u64,
    c: u32,
}

#[inline(never)]
fn get_int(x: usize) -> usize {
    let mut ret = x;
    unsafe {
        core::arch::asm!("add {}, 4" , inout(reg) ret)
    }
    ret
}

#[inline(never)]
fn vvv(mybar: &Option<Foobar>) {
    println!("HELLO {}", mybar.as_ref().unwrap().a);
    println!("HELLO {}", mybar.as_ref().unwrap().b);
    println!("HELLO {}", mybar.as_ref().unwrap().c);
}

fn consume(data: &[u8]) {
    for b in data {
        print!("{b:02x}");
    }
    println!();
}

static HELLO: Mutex<Option<Foobar>> = Mutex::new(None);
static BYE: Mutex<Option<Foobar>> = Mutex::new(None);

pub fn main() {
    let mut a = Foobar {
        a: get_int(1) as u64,
        b: get_int(2) as u64,
        c: get_int(3) as u32,
        d: [0; SIZE],
    };

    for i in 0..get_int(23) {
        a.d[i as usize % SIZE] += i as u8;
    }

    {
        const SIZE2: usize = size_of::<Option<Foobar>>();
        let mut data = [0xcc as u8; SIZE2];
        for i in 0..get_int(26) {
            data[i % SIZE2] = data[i % SIZE2].wrapping_add(i as u8);
        }
        consume(&data);
    }

    unsafe { core::arch::asm!("") }

    *HELLO.lock().unwrap() = Some(a);

    unsafe { core::arch::asm!("") }

    let tag = unsafe {
        let tmp = HELLO.lock().unwrap();
        *(&*tmp as *const _ as *const u32)
    };
    println!("The tag of HELLO: 0x{tag:08x}");

    #[allow(invalid_reference_casting)]
    let tag = unsafe {
        let tmp = HELLO.lock().unwrap();
        *(&*tmp as *const _ as *const u32 as *mut u32) ^= 0xff00;
        *(&*tmp as *const _ as *const u32)
    };
    println!("The tag of HELLO (after mutating): 0x{tag:08x}");

    unsafe { core::arch::asm!("") }
    println!("Attempting to dereference HELLO");
    vvv(&*HELLO.lock().unwrap());

    unsafe { core::arch::asm!("") }
    println!("Attempting to dereference HELLO in second way");
    let ptr = &HELLO as *const _ as usize;
    let ptr2 = unsafe {
        &*(ptr as *const Mutex<Option<Foobar>>)
    };
    let mut hello_tmp = ptr2.lock().unwrap();
    let x = hello_tmp.as_mut().unwrap();
    println!("HELLO {}", x.a);

    println!("Attempting to dereference BYE (which should panic)");
    vvv(&*BYE.lock().unwrap());
}

On x86_64-unknown-linux-gnu it panics on the second dereference of HELLO (the first one, through vvv, still succeeds), apparently because the code generated for that particular access reads the Option enum tag as a u32:

cccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
The tag of HELLO: 0x00000001
The tag of HELLO (after mutating): 0x0000ff01
Attempting to dereference HELLO
HELLO 5
HELLO 6
HELLO 7
Attempting to dereference HELLO in second way

thread 'main' panicked at undefined_behavior/src/main.rs:93:32:
called `Option::unwrap()` on a `None` value
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
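
For comparison, here is a minimal sketch (not part of the program above) of checking the variant without reading raw bytes, using std::mem::discriminant. The helper name same_variant is made up for illustration. This says nothing about how wide the tag is in memory; it only shows the supported way to ask "which variant is this?".

```rust
use std::mem::discriminant;

/// Hypothetical helper: true when both options hold the same variant
/// (both `Some` or both `None`), regardless of the payload value.
fn same_variant<T>(a: &Option<T>, b: &Option<T>) -> bool {
    // `discriminant` returns an opaque value; it can be compared,
    // but it does not expose the in-memory width of the tag.
    discriminant(a) == discriminant(b)
}

fn main() {
    let some_a = Some(5u64);
    let some_b = Some(6u64);
    let none: Option<u64> = None;

    println!("Some vs Some: {}", same_variant(&some_a, &some_b));
    println!("Some vs None: {}", same_variant(&some_a, &none));
}
```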

Interestingly, the exact same code on i686-unknown-linux-gnu treats the enum tag as a u8, so the remaining bytes at the tag location may hold uninitialized data. It also panics only when it tries to dereference BYE.

cccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
The tag of HELLO: 0xcecdcc01
The tag of HELLO (after mutating): 0xcecd3301
Attempting to dereference HELLO
HELLO 5
HELLO 6
HELLO 7
Attempting to dereference HELLO in second way
HELLO 5
Attempting to dereference BYE (which should panic)

thread 'main' panicked at undefined_behavior/src/main.rs:24:41:
called `Option::unwrap()` on a `None` value
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
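
To see how much room sits in front of the payload, here is a small sketch (again separate from the program above) that prints the sizes, the alignment, and the byte offset of the Some payload inside Option<Foobar>. The helper payload_offset is a made-up name. Only the size of Foobar follows from its repr(C) definition; the layout of Option<Foobar> is unspecified, so the other numbers are whatever a given compiler and target happen to choose.

```rust
use std::mem::{align_of, size_of};

const SIZE: usize = 256;

#[allow(dead_code)]
#[repr(C, align(8))]
struct Foobar {
    d: [u8; SIZE],
    a: u64,
    b: u64,
    c: u32,
}

/// Hypothetical helper: byte offset of the `Some` payload from the start
/// of the `Option`, or `None` when there is no payload to point at.
fn payload_offset(opt: &Option<Foobar>) -> Option<usize> {
    let base = opt as *const Option<Foobar> as usize;
    opt.as_ref()
        .map(|inner| inner as *const Foobar as usize - base)
}

fn main() {
    let opt = Some(Foobar { d: [0; SIZE], a: 1, b: 2, c: 3 });

    println!("size_of::<Foobar>()          = {}", size_of::<Foobar>());
    println!("size_of::<Option<Foobar>>()  = {}", size_of::<Option<Foobar>>());
    println!("align_of::<Option<Foobar>>() = {}", align_of::<Option<Foobar>>());
    // Bytes before the payload that are not covered by the tag itself
    // are padding, and padding has no defined contents.
    println!("payload offset               = {:?}", payload_offset(&opt));
}
```

If the payload offset comes out larger than the tag width, the bytes in between are padding, which would fit with the 0xcecdcc01 value seen above.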

Is there a reason for this discrepancy between 32-bit and 64-bit? It is also worth noting that there are cases in 64-bit mode where Rust likewise does only a byte comparison rather than a full u32 check.
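
For contrast, a sketch of an enum whose tag width is actually specified. This assumes a hand-written stand-in for Option<Foobar> (the names MaybeFoobar, Nothing, Something and tag are invented here) with #[repr(u8)], which the Rust Reference documents as starting with a u8 discriminant. Reading that first byte is then well defined, unlike reading four bytes from the front of a default-repr Option.

```rust
const SIZE: usize = 256;

#[allow(dead_code)]
#[repr(C, align(8))]
struct Foobar {
    d: [u8; SIZE],
    a: u64,
    b: u64,
    c: u32,
}

/// Hypothetical stand-in for `Option<Foobar>` with a specified tag type.
#[allow(dead_code)]
#[repr(u8)]
enum MaybeFoobar {
    Nothing = 0,
    Something(Foobar) = 1,
}

impl MaybeFoobar {
    /// Reads the discriminant as a `u8`.
    fn tag(&self) -> u8 {
        // SAFETY: a `#[repr(u8)]` enum is laid out with its `u8`
        // discriminant as the first field, so this one-byte read is valid.
        // Only this single byte is read; any bytes after it may be padding.
        unsafe { *(self as *const Self as *const u8) }
    }
}

fn main() {
    let none = MaybeFoobar::Nothing;
    let some = MaybeFoobar::Something(Foobar { d: [0; SIZE], a: 1, b: 2, c: 3 });

    println!("tag of Nothing:   0x{:02x}", none.tag());
    println!("tag of Something: 0x{:02x}", some.tag());
}
```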
