Soundness check for split_lifetime!
Rust · 2026-06-22 · surdeus

A common problem I encounter in high-performance networking code and Future implementations is this: I have a mut ref, and I want to do something with it that conditionally returns a derived mut ref. On the happy path, I want to return the derived ref with the full lifetime of the original ref. On the unhappy path, I want to do something else with the original ref. Here's a minimal example:
fn try_from_utf8(bytes: &mut [u8]) -> Result<&mut str, &mut [u8]> {
    match str::from_utf8_mut(bytes) {
        Ok(str) => Ok(str),
        Err(_) => Err(bytes), // Oops, this doesn't work because the Ok path
                              // "consumed" the ref even though it isn't
                              // used on this path. Workarounds require either
                              // 1) Checking the string twice
                              // 2) Unsafety
    }
}
This is a minimal example to illustrate the problem. There may be better ways to solve this specific problem. Here is a non-contrived example "in the wild": tokio/tokio/src/io/util/fill_buf.rs at master · tokio-rs/tokio · GitHub.
In the string example above, the unsafe version isn't too bad thanks to from_utf8_unchecked_mut, but the general case requires transmute, which is particularly difficult to reason about (see the tokio example).
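For reference, the two workarounds named in the comments of the first example might look roughly like this for the string case. This is only a sketch: the function names try_from_utf8_twice and try_from_utf8_unchecked are made up for illustration, and the unsafe variant is only correct because the same bytes were validated through a shared borrow immediately beforehand.

```rust
/// Workaround 1 (hypothetical name): validate through a shared borrow first,
/// then convert mutably. Fully safe, but the happy path scans the bytes twice.
fn try_from_utf8_twice(bytes: &mut [u8]) -> Result<&mut str, &mut [u8]> {
    if std::str::from_utf8(bytes).is_err() {
        // The shared borrow above has already ended, so `bytes` is free again.
        return Err(bytes);
    }
    // Second scan of the same bytes; it cannot fail after the check above.
    Ok(std::str::from_utf8_mut(bytes).expect("validated above"))
}

/// Workaround 2 (hypothetical name): a single scan plus an unchecked conversion.
fn try_from_utf8_unchecked(bytes: &mut [u8]) -> Result<&mut str, &mut [u8]> {
    if std::str::from_utf8(bytes).is_ok() {
        // SAFETY: the bytes were just validated as UTF-8 and have not been
        // modified since, because we hold the only mutable reference.
        Ok(unsafe { std::str::from_utf8_unchecked_mut(bytes) })
    } else {
        Err(bytes)
    }
}
```

Both compile today because the validation result is reduced to a bool before the mutable ref is used again, so no borrow derived from bytes is live across the branch.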
As I understand it, Polonius will resolve all this, but I wanted a way to do this today without double processing or unsafety, so I made a utility called split_lifetime. Usage looks something like this:
fn try_from_utf8(bytes: &mut [u8]) -> Result<&mut str, &mut [u8]> {
    // `[bytes]` is the lifetime we're splitting
    // `str::from_utf8_mut(bytes).ok()` is the "body" using bytes.
    // It gives back an `Option` derived from `bytes`.
    // `split_lifetime` turns that into a `Result` containing either
    // the extended derived value or the original ref
    split_lifetime!([bytes] str::from_utf8_mut(bytes).ok())
}
The purpose of this post is to get some additional eyes on the implementation and make sure that it is sound, that the safe API cannot produce UB, and that the safety rules are complete. Here is the implementation:
// In the hypothetical library
pub mod lib {
    use std::marker::PhantomData;
    use std::mem::MaybeUninit;

    // Trait for transmuting lifetimes within types without otherwise changing the type.
    //
    // Safety:
    // 1. Must only be implemented for types that differ only by (at least one) lifetime.
    //    E.g.
    //    Ok: &'a T -> &'b T
    //    Ok: Foo<'a> -> Foo<'b>
    //    Ok: Bar<'a, 'b> -> Bar<'c, 'b>
    //    Ok: Bar<'a, 'b> -> Bar<'c, 'c>
    //    Ok: &'a Foo<'b> -> &'c Foo<'b>
    //    Ok: &'a Foo<'b> -> &'a Foo<'c>
    //    Ok: &'a Foo<'b> -> &'c Foo<'c>
    //    Ok: Vec<&'a T> -> Vec<&'b T>
    //    Bad: &'a T -> Foo<'b> // This is not allowed even if Foo is repr(transparent) around a &'a T.
    // 2. The source and dest types must have identical layouts (I believe this currently
    //    holds for all pairs that satisfy #1 until we get specialization).
    pub unsafe trait TransmuteLifetime<'a> {
        type Output;
    }

    // Impl just what is needed for this example. In a real lib this would be implemented for all types in the stdlib
    //
    // Safety: Transmuting `&mut T` -> `&'a mut T` is valid because they only differ by lifetime
    // and share the same layout
    unsafe impl<'a, T> TransmuteLifetime<'a> for &mut T where T: ?Sized + 'a {
        type Output = &'a mut T;
    }

    // The actual function responsible for transmuting lifetimes
    //
    // Safety: The input must remain valid for the lifetime(s) contained in T::Output.
    pub unsafe fn transmute_lifetime<'a, T>(value: T) -> T::Output where T: TransmuteLifetime<'a> {
        const {
            // not strictly necessary if TransmuteLifetime is implemented correctly, but helps
            // safeguard against mistakes
            assert!(size_of::<T>() == size_of::<T::Output>());
        }
        // MaybeUninit so T isn't used after the read below
        let mut value = MaybeUninit::new(value);
        // Safety: According to the safety requirements of TransmuteLifetime<'a>, T and T::Output
        // have the same layout.
        unsafe { (value.as_mut_ptr() as *mut T::Output).read() }
    }

    // Helper for capturing a specific unnamed lifetime and transmuting to it
    #[doc(hidden)]
    #[derive(Copy, Clone)]
    pub struct LifetimeToken<'a>(PhantomData<&'a &'a mut ()>);

    impl<'a> LifetimeToken<'a> {
        pub fn new<T>(_: &&'a mut T) -> Self where T: ?Sized {
            Self(PhantomData)
        }

        // Safety: The safety rules of this function are inherited from `transmute_lifetime`.
        pub unsafe fn transmute_lifetime<T>(self, value: T) -> T::Output where T: TransmuteLifetime<'a> {
            unsafe { transmute_lifetime(value) }
        }
    }

    // The actual interesting bit. Given a mut ref, executes a body that gives back a Some(_) or None.
    // On the `Some` path, extends the lifetime of the returned value to the full lifetime of the original ref.
    // On the `None` path, gives back the original ref (with its full lifetime).
    #[macro_export]
    macro_rules! split_lifetime {
        ([$value:ident] $body:expr) => {
            {
                // ensure value is a mut ref
                let $value: &mut _ = $value;
                // create a token to refer to its lifetime
                let token = $crate::lib::LifetimeToken::new(&$value);
                // wrap the ref in MaybeUninit so it doesn't get accidentally used on the `Some`
                // path
                let mut $value = ::core::mem::MaybeUninit::new($value);
                let res = {
                    // put the ref in scope for the body
                    // Safety: `$value` always contains a valid value because we just created it initialized
                    let $value = unsafe { &mut **$value.assume_init_mut() };
                    // execute the body
                    let res = $body;
                    // extend the lifetime of the result on the `Some` path.
                    // Safety: `$value` is not used again on the `Some` path, so we can extend this
                    // lifetime without aliasing `$value`
                    res.map(|x| unsafe { token.transmute_lifetime(x) })
                };
                // on the `None` path, get back the original ref
                // Safety: We never extended the derived lifetime on the `None` path,
                // so we can get back our original ref (as we only temporarily lent it out)
                res.ok_or_else(|| unsafe { $value.assume_init() })
            }
        };
    }
}

// In user code
fn try_from_utf8(foo: &mut [u8]) -> Result<&mut str, &mut [u8]> {
    // Look ma! One pass mut try parse without unsafety
    split_lifetime!([foo] str::from_utf8_mut(foo).ok())
}

pub fn main() {
    assert!(matches!(try_from_utf8(&mut [72, 101, 108, 108, 111]), Ok(x) if x == "Hello"));
    assert!(matches!(try_from_utf8(&mut [255, 255, 255, 255]), Err(x) if x == &[255, 255, 255, 255]));
}
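To make safety rule 1 concrete, here is how the trait might be implemented for a user-defined type (the "Foo<'a> -> Foo<'b>" case from the list). The Cursor type is hypothetical, and the trait and function are repeated from the listing above, minus the const size assertion, only so the snippet compiles on its own.

```rust
use std::mem::MaybeUninit;

// Repeated from the listing above so this snippet stands alone
// (the const size assertion is omitted for brevity).
pub unsafe trait TransmuteLifetime<'a> {
    type Output;
}

pub unsafe fn transmute_lifetime<'a, T>(value: T) -> T::Output
where
    T: TransmuteLifetime<'a>,
{
    let mut value = MaybeUninit::new(value);
    // SAFETY: implementors promise that T and T::Output share a layout.
    unsafe { (value.as_mut_ptr() as *mut T::Output).read() }
}

// Hypothetical user type with a single lifetime parameter.
pub struct Cursor<'b> {
    pub buf: &'b mut [u8],
    pub pos: usize,
}

// SAFETY: `Cursor<'b>` and `Cursor<'a>` are the same type constructor and
// differ only in the lifetime argument, so rule 1 holds and the layouts match.
unsafe impl<'a, 'b> TransmuteLifetime<'a> for Cursor<'b> {
    type Output = Cursor<'a>;
}
```

Note that 'a and 'b are unrelated in the impl, so nothing in the type system stops a caller of transmute_lifetime from choosing an 'a that outlives the borrowed data. That obligation rests entirely on the caller, as the function's safety comment says.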
Note that the real implementation would accept both &mut T and Pin<&mut T> as the input ref via a sealed trait, but that code is omitted here for brevity.
Also note that split_lifetime is a macro rather than a function accepting a FnOnce because the relevant generics would be extremely complex (maybe impossible? I couldn't get something that works with non-static generic inputs and outputs due to the lack of where clauses on HRTBs).
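To illustrate where the difficulty lies: when the output is fixed to a plain &mut U, a closure-based version can be written with an ordinary higher-ranked bound. The sketch below is not the code from this post; split_ref is a hypothetical name, it uses a raw pointer instead of the token machinery, and it has not been reviewed for soundness. The trouble described above only starts once the output is an arbitrary type that mentions the closure's lifetime (e.g. Foo<'b> or Pin<&'b mut U>), which a signature of this shape does not appear able to express.

```rust
/// Hypothetical closure-based variant, restricted to `&mut U` outputs.
pub fn split_ref<'a, T, U, F>(value: &'a mut T, body: F) -> Result<&'a mut U, &'a mut T>
where
    T: ?Sized,
    U: ?Sized,
    // The body must work for every lifetime 'b, so whatever it returns in
    // `Some` has to be derived from its argument and cannot be stashed elsewhere.
    F: for<'b> FnOnce(&'b mut T) -> Option<&'b mut U>,
{
    let ptr: *mut T = value;
    // SAFETY: `ptr` was just derived from `value`, which is valid and unique for 'a.
    let derived = body(unsafe { &mut *ptr });
    match derived {
        Some(derived) => Ok(derived),
        // SAFETY (assumed, not audited): on this path nothing derived from the
        // first reborrow is still alive, so handing out the original ref again
        // does not alias.
        None => Err(unsafe { &mut *ptr }),
    }
}
```

Usage would read split_ref(bytes, |b| std::str::from_utf8_mut(b).ok()), which is close to the macro call but only covers this one output shape.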
I would greatly appreciate any thoughts on the safety/soundness of this implementation, especially safety counterexamples if they exist.