Parsing on read with LazyCell
Rust | 2025-12-25 | surdeus
I'm analysing some large tab-separated log files in various ways, and due to (1) the size of the logs, (2) the cost of parsing some fields, and (3) each analysis only accessing a small subset of the fields, I want to parse individual fields only when they're read.
I was able to achieve what I wanted with code of the following form (here simplified to parsing a couple of trivial values to avoid cluttering this post with unnecessary detail). However, the boxed closure feels messy to me. Is this just inherent complexity, or am I missing a more elegant way of achieving this?
use std::cell::LazyCell;
use std::num::{ParseFloatError, ParseIntError};
use std::str::FromStr;

fn main() {
    // in real code this string would come from BufRead's Lines iterator
    let test = "12\t3.45".to_owned();
    let values = Values::from_str(&test).expect("String has sufficient values");
    assert_eq!(&Ok(12), values.a());
    assert_eq!(&Ok(3.45), values.b());
}

type ParseResult<T> = Result<T, <T as FromStr>::Err>;
type LazyParse<'source, T> =
    LazyCell<ParseResult<T>, Box<dyn FnOnce() -> ParseResult<T> + 'source>>;

// pretend there are many more fields and the types being parsed are much more
// expensive to parse than u32/f32
pub struct Values<'source> {
    a: LazyParse<'source, u32>,
    b: LazyParse<'source, f32>,
}

#[derive(Debug)]
pub struct ParseValuesError;

impl<'source> Values<'source> {
    pub fn from_str(s: &'source str) -> Result<Self, ParseValuesError> {
        let mut parts = s.split('\t');
        let a = parts.next().ok_or(ParseValuesError)?;
        let b = parts.next().ok_or(ParseValuesError)?;
        Ok(Self {
            a: LazyCell::new(Box::new(|| a.parse())),
            b: LazyCell::new(Box::new(|| b.parse())),
        })
    }

    pub fn a(&self) -> &Result<u32, ParseIntError> {
        &self.a
    }

    pub fn b(&self) -> &Result<f32, ParseFloatError> {
        &self.b
    }
}
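For comparison, here is a minimal sketch of one possible way to avoid the boxed closure; it is not necessarily the most elegant option. It assumes it is acceptable to keep each field's raw `&str` next to a `std::cell::OnceCell` and to parse inside `get_or_init` on first access. `LazyField` and its `new`/`get` methods are hypothetical names introduced only for this illustration; everything else mirrors the code above.

```rust
use std::cell::OnceCell;
use std::num::{ParseFloatError, ParseIntError};
use std::str::FromStr;

// Hypothetical helper (not part of std): the unparsed text of one field
// plus a cache that is filled the first time the field is read.
pub struct LazyField<'source, T: FromStr> {
    raw: &'source str,
    parsed: OnceCell<Result<T, T::Err>>,
}

impl<'source, T: FromStr> LazyField<'source, T> {
    fn new(raw: &'source str) -> Self {
        Self { raw, parsed: OnceCell::new() }
    }

    // Parses on the first call; later calls return the cached result.
    fn get(&self) -> &Result<T, T::Err> {
        self.parsed.get_or_init(|| self.raw.parse())
    }
}

pub struct Values<'source> {
    a: LazyField<'source, u32>,
    b: LazyField<'source, f32>,
}

#[derive(Debug)]
pub struct ParseValuesError;

impl<'source> Values<'source> {
    pub fn from_str(s: &'source str) -> Result<Self, ParseValuesError> {
        let mut parts = s.split('\t');
        let a = parts.next().ok_or(ParseValuesError)?;
        let b = parts.next().ok_or(ParseValuesError)?;
        Ok(Self {
            a: LazyField::new(a),
            b: LazyField::new(b),
        })
    }

    pub fn a(&self) -> &Result<u32, ParseIntError> {
        self.a.get()
    }

    pub fn b(&self) -> &Result<f32, ParseFloatError> {
        self.b.get()
    }
}

fn main() {
    let test = "12\t3.45".to_owned();
    let values = Values::from_str(&test).expect("String has sufficient values");
    assert_eq!(&Ok(12), values.a());
    assert_eq!(&Ok(3.45), values.b());
}
```

In this sketch the closure passed to `get_or_init` is created at the call site, so its type never has to be named in the struct and nothing is heap-allocated per field. The trade-off is that each field carries its `&str` alongside the cell, and the accessors call `get()` rather than relying on `LazyCell`'s `Deref`.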