Parsing on read with LazyCell

⚓ Rust    📅 2025-12-25    👤 surdeus    👁️ 1

surdeus

I'm analysing some large tab-separated log files in various ways, and due to (1) the size of the logs, (2) the cost of parsing some fields, and (3) each analysis only accessing a small subset of the fields, I want to parse individual fields only when they're read.

I was able to achieve what I wanted with code of the following form (here simplified to parsing a couple of trivial values to avoid cluttering this post with unnecessary detail). However, the boxed closure feels messy to me. Is this just inherent complexity, or am I missing a more elegant way of achieving this?

use std::cell::LazyCell;
use std::num::{ParseFloatError, ParseIntError};
use std::str::FromStr;

fn main() {
    // in real code this string would come from BufRead's Lines iterator
    let test = "12\t3.45".to_owned();
    
    let values = Values::from_str(&test).expect("String has sufficient values");

    assert_eq!(&Ok(12), values.a());
    assert_eq!(&Ok(3.45), values.b());
}

type ParseResult<T> = Result<T, <T as FromStr>::Err>;

type LazyParse<'source, T> =
    LazyCell<ParseResult<T>, Box<dyn FnOnce() -> ParseResult<T> + 'source>>;

// pretend there are many more fields and the types being parsed are much more
// expensive to parse than u32/f32
pub struct Values<'source> {
    a: LazyParse<'source, u32>,
    b: LazyParse<'source, f32>,
}

#[derive(Debug)]
pub struct ParseValuesError;

impl<'source> Values<'source> {
    pub fn from_str(s: &'source str) -> Result<Self, ParseValuesError> {
        let mut parts = s.split('\t');
        let a = parts.next().ok_or(ParseValuesError)?;
        let b = parts.next().ok_or(ParseValuesError)?;

        Ok(Self {
            a: LazyCell::new(Box::new(|| a.parse())),
            b: LazyCell::new(Box::new(|| b.parse())),
        })
    }

    pub fn a(&self) -> &Result<u32, ParseIntError> {
        &self.a
    }

    pub fn b(&self) -> &Result<f32, ParseFloatError> {
        &self.b
    }
}
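
For comparison, here is a minimal sketch of one possible alternative that avoids the boxed closure, assuming it is acceptable to keep the raw field text next to the cached result. It stores the `&str` slice together with a `std::cell::OnceCell` and parses inside the accessor. `LazyField` and its `get` method are hypothetical names introduced only for this illustration; the rest mirrors the code above.

```rust
use std::cell::OnceCell;
use std::str::FromStr;

type ParseResult<T> = Result<T, <T as FromStr>::Err>;

/// Hypothetical helper (not from the original post): holds the raw field
/// text and caches the result of parsing it.
pub struct LazyField<'source, T: FromStr> {
    raw: &'source str,
    parsed: OnceCell<ParseResult<T>>,
}

impl<'source, T: FromStr> LazyField<'source, T> {
    pub fn new(raw: &'source str) -> Self {
        Self { raw, parsed: OnceCell::new() }
    }

    /// Parses on the first call; later calls return the cached result.
    pub fn get(&self) -> &ParseResult<T> {
        self.parsed.get_or_init(|| self.raw.parse())
    }
}

// As in the original, pretend there are many more (and more expensive) fields.
pub struct Values<'source> {
    a: LazyField<'source, u32>,
    b: LazyField<'source, f32>,
}

#[derive(Debug)]
pub struct ParseValuesError;

impl<'source> Values<'source> {
    pub fn from_str(s: &'source str) -> Result<Self, ParseValuesError> {
        let mut parts = s.split('\t');
        let a = parts.next().ok_or(ParseValuesError)?;
        let b = parts.next().ok_or(ParseValuesError)?;

        Ok(Self {
            a: LazyField::new(a),
            b: LazyField::new(b),
        })
    }

    pub fn a(&self) -> &ParseResult<u32> {
        self.a.get()
    }

    pub fn b(&self) -> &ParseResult<f32> {
        self.b.get()
    }
}
```

Under these assumptions each field costs one `&str` plus the cell, with no heap allocation or dynamic dispatch, and the public `a()` and `b()` accessors keep the same signatures as before. The trade-off is that the parsing step is fixed to `str::parse`; if different fields needed different initialisation logic, the closure-based `LazyCell` version would be more flexible.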
