Avoiding UB but "safe" data race in a lock-free slab allocator

⚓ rust    📅 2025-06-04    👤 surdeus    👁️ 2      

surdeus

I'm working on a concurrent slab allocator that operates on a pre-allocated memory block and has O(1) memory overhead (for types at least as large as a usize) and O(1) initialization time with respect to the backing storage size. The current implementation uses a Treiber stack to maintain the free list, with the next pointers being usize indices bit-tagged with the slot's deallocation count to mitigate the ABA problem. I also plan to implement backoff strategies using thread-local state to reduce contention on the free-list head.
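To make the bit-tagging concrete, the packing scheme looks roughly like this (the 48-bit index / 16-bit tag split below is an assumption for the sketch, not necessarily the split used in the playground code):

```rust
// Illustrative tagged-index helpers. The 48/16 split is an assumption
// for this sketch; the real code may divide the word differently.
const IDX_BITS: u32 = 48;
const IDX_MASK: usize = (1 << IDX_BITS) - 1;

// Combine a slot index with its deallocation count. The count is bumped
// on every free, so a recycled slot no longer compares equal to a stale head.
fn pack(idx: usize, tag: usize) -> usize {
    debug_assert!(idx <= IDX_MASK);
    (tag << IDX_BITS) | idx
}

// Split a tagged word back into (index, tag).
fn unpack(tagged: usize) -> (usize, usize) {
    (tagged & IDX_MASK, tagged >> IDX_BITS)
}

fn main() {
    let tagged = pack(5, 3);
    assert_eq!(unpack(tagged), (5, 3));
}
```

Note that the tag can still wrap after enough frees of the same slot, so tagging only makes ABA unlikely, not impossible.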

The API is quite simple:

  • Call AtomicSlab::<T>::new(storage) with a *mut [u8] pointer to pre-allocated storage
  • Allocate with slab.alloc(), getting an AtomicSlabPtr<T> which stores both a direct *mut T to the allocated slot, and the slot's bit-tagged index.
  • Free with slab.free(ptr), where ptr is the AtomicSlabPtr given by slab.alloc().

Code: Playground

In order to achieve O(1) memory overhead, I store the free-list next pointer in a union with the storage slot for a value, which can safely be done in a non-concurrent slab allocator. In a concurrent scenario, however, the following interleaving creates a data race:

  1. Thread A and B call alloc() and enter the CAS loop, reading the same free-list head.
  2. Thread A succeeds the CAS and becomes the owner of the memory slot corresponding to head.
  3. Thread A writes a value to head's memory slot.
  4. Thread B reads the next field of the now-stale head to prepare for the CAS.
  5. Thread B tries to CAS the free-list head to that next value, which fails because its copy of head is outdated.
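Annotated against the steps above, the alloc-side CAS loop is roughly the following (sketch only: tags are omitted and next lives in a separate atomic array for brevity; the real code overlaps next with the value slot, which is what turns the read in step 4 into a race):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const NONE: usize = usize::MAX; // sentinel: empty free list

// Minimal Treiber free-list pop. Keeping `next` in its own atomic array
// sidesteps the race but costs O(n) extra memory, which is what the
// union layout is trying to avoid.
struct FreeList {
    head: AtomicUsize,
    next: Vec<AtomicUsize>,
}

impl FreeList {
    fn pop(&self) -> Option<usize> {
        // Step 1: read the current head.
        let mut head = self.head.load(Ordering::Acquire);
        loop {
            if head == NONE {
                return None;
            }
            // Step 4: read head's next. In the union version this load can
            // observe bytes of a value another thread (step 3) is writing.
            let next = self.next[head].load(Ordering::Relaxed);
            // Steps 2/5: try to swing head; a stale `head` fails here.
            match self.head.compare_exchange_weak(
                head,
                next,
                Ordering::Acquire,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(head), // step 2: we now own this slot
                Err(actual) => head = actual,
            }
        }
    }
}

fn main() {
    // Free list: 0 -> 1 -> end.
    let fl = FreeList {
        head: AtomicUsize::new(0),
        next: vec![AtomicUsize::new(1), AtomicUsize::new(NONE)],
    };
    assert_eq!(fl.pop(), Some(0));
    assert_eq!(fl.pop(), Some(1));
    assert_eq!(fl.pop(), None);
}
```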

So, steps 3 and 4 constitute a data race, which both Miri and TSAN are able to detect. I'm quite confident that when the CAS succeeds such a race cannot happen, so the data race is "safe" (see a proof sketch at line 221 in the playground code). But this doesn't change the fact that, from what I understand about the Rust/C++11 memory model, data races are always UB.

So, the question is: Is there a way to make this code UB-free while keeping things lock-free and with constant memory overhead? All the strategies I can think of involve some kind of spin locking.

There is also the possibility of making it impossible to get a mutable pointer from the AtomicSlabPtr<T>, instead providing Cell-like methods that copy values of T in and out, using a relaxed atomic operation for the lower part that overlaps the next pointer. However, this prevents holding a Rust reference to the allocated value, so any safe API built on top of it would be quite limited.
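For the simplest case, where T is exactly one word, that Cell-like design collapses to a single atomic per slot (names are illustrative); for larger T, only the first word overlaps next and would need the atomic treatment, while the tail bytes could be copied with plain reads and writes:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Sketch of the Cell-like alternative for a word-sized payload: the word
// that overlaps the free-list `next` pointer is only ever touched with
// atomic operations, so the conflicting accesses in the race become
// relaxed atomics instead of a data race.
struct CellSlot {
    word: AtomicUsize, // doubles as `next` (free) and value bits (allocated)
}

impl CellSlot {
    fn set(&self, bits: usize) {
        self.word.store(bits, Ordering::Relaxed);
    }

    fn get(&self) -> usize {
        self.word.load(Ordering::Relaxed)
    }
}

fn main() {
    let slot = CellSlot { word: AtomicUsize::new(0) };
    slot.set(7);
    assert_eq!(slot.get(), 7);
}
```

Because get and set copy bits instead of handing out &T or &mut T, no reference to the slot ever exists, which is what makes the relaxed accesses sufficient — and also what limits any safe API built on top.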
