On SpinLock performance

⚓ Rust    📅 2025-10-24    👤 surdeus    👁️ 2      

This post is auto-generated from RSS feed The Rust Programming Language Forum - Latest topics. Source: On SpinLock performance

I was experimenting with atomic types and, by pure chance, stumbled upon a strange way to improve the performance of my naive spin lock by quite a margin (more than 10x on the couple of machines I tested): simply adding a sleep. I’m sure this is a known phenomenon, but I couldn’t find much about it, which is why I’m posting here.

From what I’ve read, spinlocks are sometimes frowned upon, but I wrote this as an MVP to show what I’m talking about; as a disclaimer, it isn’t representative of my real use case. Anyway, I’ve included my implementation and the tests I’ve been using to benchmark it.

use std::{
    cell::UnsafeCell,
    marker::PhantomData,
    ops::{Deref, DerefMut},
    sync::atomic::{AtomicBool, Ordering},
    time::Duration,
};

struct SpinLock<T> {
    inner: UnsafeCell<T>,
    locked: AtomicBool,
    with_sleep: bool,
}

impl<T> SpinLock<T> {
    pub fn new(inner: T, with_sleep: bool) -> Self {
        Self {
            inner: UnsafeCell::new(inner),
            locked: AtomicBool::new(false),
            with_sleep,
        }
    }

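    pub fn lock(&self) -> LockGuard<'_, T> {
        // `swap` returns the previous value: `true` means another thread
        // still holds the lock, so keep retrying until we observe `false`.
        while self.locked.swap(true, Ordering::Acquire) {
            if self.with_sleep {
                // Requests 1 ns; `sleep` never returns early, and the real
                // pause depends on the OS scheduler and may be much longer.
                std::thread::sleep(Duration::from_nanos(1));
            }
            std::hint::spin_loop();
        }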

        LockGuard {
            lock: self,
            _p: PhantomData,
        }
    }
}

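// SAFETY: the lock hands out access to `T` to one thread at a time, so
// sharing `SpinLock<T>` between threads only requires `T: Send`.
unsafe impl<T> Sync for SpinLock<T> where T: Send {}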

struct LockGuard<'a, T> {
    lock: &'a SpinLock<T>,
    _p: PhantomData<&'a mut T>,
}

impl<T> Deref for LockGuard<'_, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        unsafe { &*self.lock.inner.get() }
    }
}

impl<T> DerefMut for LockGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        unsafe { &mut *self.lock.inner.get() }
    }
}

impl<T> Drop for LockGuard<'_, T> {
    fn drop(&mut self) {
        self.lock.locked.store(false, Ordering::Release);
    }
}

#[cfg(test)]
mod nascar {
    use crate::lock::SpinLock;
    use std::sync::Mutex;

    const THREADS: usize = 4;
    const ITERATIONS: usize = 300000;

    #[test]
    fn spinlock_slow() {
        let q = SpinLock::new(Box::new(0), false);

        std::thread::scope(|s| {
            for _ in 0..THREADS {
                s.spawn(|| {
                    for _ in 0..ITERATIONS {
                        **q.lock() += 1;
                    }
                });
            }
        });

        assert_eq!(**q.lock(), ITERATIONS * THREADS);
    }

    #[test]
    fn spinlock_fast() {
        let q = SpinLock::new(Box::new(0), true);

        std::thread::scope(|s| {
            for _ in 0..THREADS {
                s.spawn(|| {
                    for _ in 0..ITERATIONS {
                        **q.lock() += 1;
                    }
                });
            }
        });

        assert_eq!(**q.lock(), ITERATIONS * THREADS);
    }

    #[test]
    fn mutex() {
        let q = Mutex::new(Box::new(0));

        std::thread::scope(|s| {
            for _ in 0..THREADS {
                s.spawn(|| {
                    for _ in 0..ITERATIONS {
                        **q.lock().unwrap() += 1;
                    }
                });
            }
        });

        assert_eq!(**q.lock().unwrap(), THREADS * ITERATIONS);
    }
}
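
The tests above only assert the final count, so any timing has to come from outside them. As a rough sketch of how the wall-clock time could be measured directly, the helper below runs an operation from several scoped threads and returns the elapsed time. The names `time_contended` and `mutex_baseline` are hypothetical and not part of my code above; only the `Mutex` baseline is shown so that the block stays self-contained.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Runs `op` `iterations` times on each of `threads` scoped threads and
/// returns the wall-clock time until all of them have finished.
/// (Hypothetical helper, for illustration only.)
pub fn time_contended<F>(threads: usize, iterations: usize, op: F) -> Duration
where
    F: Fn() + Sync,
{
    let start = Instant::now();
    // `scope` joins every spawned thread before it returns.
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iterations {
                    op();
                }
            });
        }
    });
    start.elapsed()
}

/// Baseline: the same counter increment as in the tests, guarded by
/// `std::sync::Mutex`. Returns the final count and the elapsed time.
pub fn mutex_baseline(threads: usize, iterations: usize) -> (usize, Duration) {
    let counter = Mutex::new(0usize);
    let elapsed = time_contended(threads, iterations, || {
        *counter.lock().unwrap() += 1;
    });
    // Bind the value first so the temporary guard is dropped before returning.
    let total = *counter.lock().unwrap();
    (total, elapsed)
}
```

The same helper would accept a closure that locks a `SpinLock` instead, so both variants can be timed with identical thread and iteration counts. A single run is noisy, so repeating the measurement a few times is probably wise.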

Is the implementation correct, or is this a fluke? Why does this happen, and does it happen on all architectures (mine is x86_64)? If inserting a sleep makes it faster, is there a heuristic for picking the sleep duration? Is there any other mechanism that achieves a better result than sleep? (For example, I tried a short loop followed by yield_now(), but the result wasn't impressive, and also _mm_pause(), with slightly better results.)
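
For reference, the "short loop followed by yield_now()" variant could look roughly like the sketch below. It is a reconstruction under assumptions, not the exact code that was measured: the type names `YieldingSpinLock` and `YieldingGuard`, the helper `contended_count`, and the `SPINS_BEFORE_YIELD` threshold of 64 are all made up for illustration.

```rust
use std::cell::UnsafeCell;
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical tuning knob: failed attempts before yielding to the scheduler.
const SPINS_BEFORE_YIELD: u32 = 64;

pub struct YieldingSpinLock<T> {
    inner: UnsafeCell<T>,
    locked: AtomicBool,
}

// SAFETY: access to `T` is handed to one thread at a time.
unsafe impl<T: Send> Sync for YieldingSpinLock<T> {}

pub struct YieldingGuard<'a, T> {
    lock: &'a YieldingSpinLock<T>,
    _p: PhantomData<&'a mut T>,
}

impl<T> YieldingSpinLock<T> {
    pub fn new(inner: T) -> Self {
        Self {
            inner: UnsafeCell::new(inner),
            locked: AtomicBool::new(false),
        }
    }

    pub fn lock(&self) -> YieldingGuard<'_, T> {
        let mut spins = 0u32;
        // `swap` returns the previous value; `true` means the lock is taken.
        while self.locked.swap(true, Ordering::Acquire) {
            if spins < SPINS_BEFORE_YIELD {
                // Short busy-wait with the CPU's spin hint.
                std::hint::spin_loop();
                spins += 1;
            } else {
                // Give up the time slice, then start a new spin round.
                std::thread::yield_now();
                spins = 0;
            }
        }
        YieldingGuard { lock: self, _p: PhantomData }
    }
}

impl<T> Deref for YieldingGuard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: the guard exists only while the lock is held.
        unsafe { &*self.lock.inner.get() }
    }
}

impl<T> DerefMut for YieldingGuard<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        // SAFETY: the guard exists only while the lock is held.
        unsafe { &mut *self.lock.inner.get() }
    }
}

impl<T> Drop for YieldingGuard<'_, T> {
    fn drop(&mut self) {
        self.lock.locked.store(false, Ordering::Release);
    }
}

/// Same workload as the tests: every thread increments a shared counter.
pub fn contended_count(threads: usize, iterations: usize) -> usize {
    let counter = YieldingSpinLock::new(0usize);
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iterations {
                    *counter.lock() += 1;
                }
            });
        }
    });
    // Bind the value first so the temporary guard is dropped before returning.
    let total = *counter.lock();
    total
}
```

The threshold is a guess; whether any particular value helps would have to be measured on the machine in question.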

Any feedback is appreciated 🙂
