Should I parse as graphemes?

⚓ Rust    📅 2025-08-18    👤 surdeus    👁️ 4      

surdeus

For parsing data coming from a request as bytes, I truncate it with result.chars().take(240).collect::<String>() and then call as_bytes() on the result, since slicing with result[..240] would panic if byte index 240 is not on a UTF-8 character boundary.
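For reference, a minimal std-only sketch of two ways to truncate without panicking and without allocating a new String. The helper names truncate_chars and truncate_bytes are made up for illustration; they are not from any library.

```rust
/// Hypothetical helper: longest prefix of `s` holding at most `max_chars`
/// Unicode scalar values (what `chars()` iterates over).
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    // `char_indices` yields the byte offset at which each char starts, so the
    // offset of the char at position `max_chars` is a valid place to cut.
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // the string has `max_chars` chars or fewer
    }
}

/// Hypothetical helper: longest prefix of `s` that is at most `max_bytes`
/// bytes long and ends on a UTF-8 character boundary.
fn truncate_bytes(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

fn main() {
    // "h", then U+00E9 (two bytes in UTF-8), then "llo".
    let s = "h\u{e9}llo";
    println!("{}", truncate_chars(s, 2));
    println!("{}", truncate_bytes(s, 2));
}
```

The first helper counts chars, the second counts bytes, so which one fits depends on whether the 240 limit is meant as a character limit or a storage limit. Neither is aware of grapheme clusters.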

But I've read in the documentation that chars() iterates over Unicode scalar values, which are not necessarily what a user perceives as characters.

I guess this gives away the answer: for text shown to a user, we should count user-perceived characters, which are grapheme clusters. For that, one needs the unicode-segmentation crate.
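To make the difference concrete, here is a small std-only sketch showing how a chars()-based cut can split something a user sees as a single character. The helper name take_chars is hypothetical, and the example string is just an "e" followed by a combining acute accent.

```rust
/// Hypothetical helper mirroring the chars().take(n).collect() approach.
fn take_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}

fn main() {
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: rendered as one
    // accented letter, but made of two Unicode scalar values.
    let s = "e\u{301}";
    println!("chars: {}", s.chars().count());
    println!("bytes: {}", s.len());

    // Taking one char keeps the base letter and drops the accent.
    let cut = take_chars(s, 1);
    println!("cut is plain 'e': {}", cut == "e");
}
```

A grapheme-based truncation would keep the base letter and its accent together, which is the case for using graphemes(true) when the limit is about what the user sees.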

Something like:

use unicode_segmentation::UnicodeSegmentation;

let content: String = t.content.graphemes(true).take(240).collect();

Is this reasoning correct, or is there another approach that should be preferred?

5 posts - 3 participants


🏷️ Rust_feed