Avoiding UB but "safe" data race in a lock-free slab allocator
⚓ Rust 📅 2025-06-04 👤 surdeus

I'm working on a concurrent slab allocator operating on a pre-allocated memory block, with O(1) memory overhead (for types at least as large as a usize) and O(1) initialization time with respect to the backing storage size. The current implementation uses a Treiber stack to maintain the free list, with the next pointers being usize indices bit-tagged with the slot's deallocation count to mitigate the ABA problem. I also plan to implement backoff strategies using thread-local state to reduce contention on the free-list head.
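For concreteness, here is a minimal sketch of the slot and head layout this implies; the `Slot` union, the `pack`/`unpack` helpers, and the 32/32 bit split are illustrative assumptions, not the actual playground code.

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::AtomicUsize;

// A slot is either a live `T` or, while free, the (tagged) index of the next
// free slot. Overlapping the two is what keeps the memory overhead O(1).
union Slot<T> {
    value: ManuallyDrop<T>,
    next: usize, // tagged index of the next free slot; only meaningful while free
}

// Tag packing for ABA mitigation (assumes a 64-bit target): the slot index
// lives in the low bits, the slot's deallocation count in the high bits.
const INDEX_BITS: u32 = 32;
const INDEX_MASK: usize = (1 << INDEX_BITS) - 1;

fn pack(index: usize, dealloc_count: usize) -> usize {
    (dealloc_count << INDEX_BITS) | (index & INDEX_MASK)
}

fn unpack(tagged: usize) -> (usize, usize) {
    (tagged & INDEX_MASK, tagged >> INDEX_BITS)
}

struct AtomicSlab<T> {
    head: AtomicUsize,   // tagged index of the first free slot
    slots: *mut Slot<T>, // points into the caller-provided storage
    capacity: usize,
}
```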
The API is quite simple:
- Call `AtomicSlab::<T>::new(storage)` with a `*mut [u8]` pointer to pre-allocated storage.
- Allocate with `slab.alloc()`, getting an `AtomicSlabPtr<T>` which stores both a direct `*mut T` to the allocated slot and the slot's bit-tagged index.
- Free with `slab.free(ptr)`, where `ptr` is the `AtomicSlabPtr` given by `slab.alloc()`.
Code: Playground
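For illustration, using that API might look roughly like the sketch below. The `Option` return from `alloc`, the `as_ptr` accessor on `AtomicSlabPtr`, and the placement of `unsafe` are assumptions about the playground code, not its actual signatures.

```rust
fn example() {
    // Pre-allocated backing storage; a boxed byte slice stands in for whatever
    // memory the caller actually provides.
    let mut storage = vec![0u8; 4096].into_boxed_slice();
    let storage_ptr: *mut [u8] = &mut *storage;

    // Hypothetical: `new` is assumed to be unsafe because it takes a raw pointer.
    let slab = unsafe { AtomicSlab::<u64>::new(storage_ptr) };

    // Allocate a slot and write through the raw pointer the handle carries.
    let ptr: AtomicSlabPtr<u64> = slab.alloc().expect("slab exhausted");
    unsafe { ptr.as_ptr().write(42) };

    // Return the slot to the free list.
    slab.free(ptr);
}
```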
In order to achieve O(1) memory overhead, I stored the free-list next pointer in a union with the storage slot for a value, which can safely be done for a non-concurrent slab allocator. However, in a concurrent scenario, we can create a data race:
1. Threads A and B both call `alloc()` and enter the CAS loop, reading the same free-list `head`.
2. Thread A's CAS succeeds, so it becomes the owner of the memory slot corresponding to `head`.
3. Thread A writes a value into `head`'s memory slot.
4. Thread B reads the now-stale `head`'s `next` to prepare for its CAS (shown in the sketch below).
5. Thread B tries to CAS the free-list head to that `next` pointer, which fails because its value of `head` is outdated.
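A minimal sketch of what the allocation loop looks like, reusing the hypothetical `Slot`/`unpack` layout from above; the memory orderings and the empty-list sentinel are assumptions. The marked line is step 4's unsynchronized read.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Sketch of `alloc` reduced to the index management; `capacity` is assumed to
// be the "empty free list" sentinel value.
fn alloc_index<T>(head: &AtomicUsize, slots: *mut Slot<T>, capacity: usize) -> Option<usize> {
    let mut observed = head.load(Ordering::Acquire);
    loop {
        let (index, _tag) = unpack(observed);
        if index == capacity {
            return None; // free list is empty
        }
        // Step 4: a plain read of the slot's `next` field. If another thread
        // already won this slot, it may concurrently be writing a `T` into the
        // same bytes (step 3), which is the reported data race.
        let next_tagged = unsafe { (*slots.add(index)).next };
        match head.compare_exchange_weak(
            observed,
            next_tagged, // `next` is already a tagged index
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => return Some(index),        // we now own `index`
            Err(current) => observed = current, // lost the race; retry with fresh head
        }
    }
}
```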
So, steps 3 and 4 constitute a data race, which both Miri and TSan detect. I'm quite confident that such a race cannot have happened when the CAS succeeds, and the racily read value is only ever used on the success path, so the data race is "safe" in practice (see a proof sketch at line 221 in the playground code). But that doesn't change the fact that, as far as I understand the Rust/C++11 memory model, data races are always UB.
So, the question is: Is there a way to make this code UB-free while keeping things lock-free and with constant memory overhead? All the strategies I can think of involve some kind of spin locking.
There is also the possibility of making it impossible to get a mutable pointer out of the `AtomicSlabPtr<T>`, instead providing Cell-like methods that copy values of `T` in and out, using a relaxed atomic operation for the lower part that overlaps the `next` pointer. However, this prevents holding a Rust reference to the allocated value, so any safe API built on top of it would be quite limited.
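For what that alternative could look like, here is a hedged sketch of such copy-in/copy-out accessors. It assumes `T: Copy`, that `T` is at least usize-sized and usize-aligned, and that only the first word of the slot overlaps the `next` field; the helper names are made up.

```rust
use std::mem::{align_of, size_of, MaybeUninit};
use std::sync::atomic::{AtomicUsize, Ordering};

// Write a value into an owned slot. The first `usize` of the slot (the bytes
// that overlap the free-list `next` field) is only ever touched through an
// AtomicUsize, so a concurrent racy reader in `alloc` sees an atomic access
// on those bytes instead of a plain write.
unsafe fn slot_set<T: Copy>(slot: *mut T, value: T) {
    debug_assert!(size_of::<T>() >= size_of::<usize>());
    debug_assert!(align_of::<T>() >= align_of::<usize>());

    let src = &value as *const T as *const u8;
    let dst = slot as *mut u8;

    // The non-overlapping tail is private to the slot's owner: plain copy.
    std::ptr::copy_nonoverlapping(
        src.add(size_of::<usize>()),
        dst.add(size_of::<usize>()),
        size_of::<T>() - size_of::<usize>(),
    );

    // The overlapping word goes through an atomic store (Relaxed is assumed to
    // be enough because the CAS on the free-list head provides the ordering).
    let word = (src as *const usize).read();
    (*(dst as *const AtomicUsize)).store(word, Ordering::Relaxed);
}

// Read a value back out of an owned slot, mirroring `slot_set`.
unsafe fn slot_get<T: Copy>(slot: *const T) -> T {
    let mut out = MaybeUninit::<T>::uninit();
    let src = slot as *const u8;
    let dst = out.as_mut_ptr() as *mut u8;

    let word = (*(src as *const AtomicUsize)).load(Ordering::Relaxed);
    (dst as *mut usize).write(word);
    std::ptr::copy_nonoverlapping(
        src.add(size_of::<usize>()),
        dst.add(size_of::<usize>()),
        size_of::<T>() - size_of::<usize>(),
    );
    out.assume_init()
}
```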